Abstract

In [33] Thorup and Zwick came up with a landmark distance oracle. Given an n-vertex
undirected graph G = (V, E) and a parameter k = 1, 2, ..., their oracle has size O(k · n^{1+1/k}),
and upon a query (u, v) it constructs a path Π between u and v of length δ(u, v) such that
d_G(u, v) ≤ δ(u, v) ≤ (2k − 1) · d_G(u, v). The query time of the oracle from [33] is O(k) (in
addition to the length of the returned path), and it was subsequently improved to O(1)
[36, 13]. A major drawback of the oracle of [33] is that its space is Ω(n · log n). Mendel and
Naor [23] devised an oracle with space O(n^{1+1/k}) and stretch O(k), but their oracle can only
report distance estimates and not actual paths. In this paper we devise a path-reporting
distance oracle with size O(n^{1+1/k}), stretch O(k) and query time O(n^ε), for an arbitrarily
small ε > 0. In particular, for k = log n our oracle provides logarithmic stretch using linear
size. Another variant of our oracle has size O(n log log n), polylogarithmic stretch, and query
time O(log log n).

For unweighted graphs we devise a distance oracle with multiplicative stretch O(1), additive
stretch O(β(k)), for a function β(·), space O(n^{1+1/k} · β), and query time O(n^ε), for
an arbitrarily small constant ε > 0. The tradeoff between multiplicative stretch and size in
these oracles is far below Erdős' girth conjecture threshold (which is stretch 2k − 1 and size
O(n^{1+1/k})). Breaking the girth conjecture tradeoff is achieved by exhibiting a tradeoff of
different nature between additive stretch β(k) and size O(n^{1+1/k}). A similar type of tradeoff
was exhibited by a construction of (1 + ε, β)-spanners due to Elkin and Peleg [18]. However,
so far (1 + ε, β)-spanners had no counterpart in the distance oracles' world.

An important novel tool that we develop on the way to these results is a distance-preserving
path-reporting oracle. We believe that this oracle is of independent interest.

*A preliminary version of this paper was published in SODA'15 [19].
1 Introduction


1.1 Distance Oracles for General Graphs 

In the distance oracle problem we wish to preprocess a weighted undirected n-vertex graph G =
(V, E). As a result of this preprocessing we construct a compact data structure D(G) (called a
distance oracle), which given a query pair (u, v) of vertices will efficiently return a distance
estimate δ(u, v) of the distance d_G(u, v) between u and v in G. Moreover, the distance oracle
should also compute an actual path Π(u, v) of length δ(u, v) between these vertices in G. We say
that a distance oracle is path-reporting if it does produce the paths Π(u, v) as above; otherwise we
say that it is not path-reporting.

The most important parameters of a distance oracle are its stretch, its size, and its worst-case
query time.^1 The stretch α of a distance oracle D(G) is the smallest (in fact, infimum) value such
that for every u, v ∈ V, d_G(u, v) ≤ δ(u, v) ≤ α · d_G(u, v).

The term distance oracle was coined by Thorup and Zwick [33]. See their paper also for a
very persuasive motivation of this natural notion. In their seminal paper Thorup and Zwick [33]
devised a path-reporting distance oracle (henceforth, TZ oracle). The TZ oracle with a parameter
k = 1, 2, ... has size O(k · n^{1+1/k}), stretch 2k − 1 and query time O(k). As argued in [33], this
tradeoff between size and stretch is essentially optimal for k ≤ log n / log log n, as Erdős' girth
conjecture implies that space Ω(n^{1+1/k}) is required for any k. Note, however, that
k · n^{1+1/k} = Ω(n · log n), and Thorup and Zwick [33] left it open if one can obtain meaningful
distance oracles of linear size (or, more generally, size o(n log n)).

A partial answer to this question was provided by Mendel and Naor [23], who devised a
distance oracle with size O(n^{1+1/k}), stretch O(k) and query time O(1). Alas, their distance oracle
is inherently not path-reporting. Specifically, the oracle of [23] stores a collection of O(k · n^{1/k})
hierarchically-separated trees (henceforth, HSTs; see [8] for the definition), whose sizes sum up
to O(n^{1+1/k}). The query algorithm for this oracle can return paths from these HSTs, i.e., paths
which at best can belong to the metric closure of the original graph. These paths will typically
not belong to the graph itself.

One can try to convert this collection into a collection of low-stretch spanning trees of the input 
graph G using star-decomposition or petal-decomposition techniques (see DEli). However, each 
of these spanning trees is doomed to have n − 1 edges, making the size of the entire structure as
large as Ω(k · n^{1+1/k}). (In addition, with the current state-of-the-art techniques with low-stretch
spanning trees one can only achieve bounds which are somewhat worse than the optimal ones
achievable with HSTs. Hence the approach that we have just outlined will probably produce an
oracle with stretch ω(k), while using space O(k · n^{1+1/k}).)

Another result in this direction was recently obtained by Elkin, Neiman and Wulff-Nilsen mi. 
For a parameter f > 1 their oracle uses space 0{n-t) and provides stretch 0{\Jt-r?l'^^) for weighted 
graphs. The query time of their oracle is Oilogt ■ log.^ Wmax), where Wmax is the aspect ratio of 
the graph, i.e., the ratio between the heaviest and the lightest edge. For unweighted graphs their 
oracle exhibits roughly the same behavior. For a parameter e > 0 it uses space 0{n ■ t/e) and 
provides stretch 0{t ■ -|- 

^1 The query time of all path-reporting distance oracles that we will discuss is of the form O(q + |Π|), where Π is
the path returned by the query algorithm. To simplify the notation we will often omit the additive term of O(|Π|).
The distance oracles of Elkin, Neiman and Wulff-Nilsen are the first path-reporting oracles that
use o(n log n) space and provide non-trivial stretch. However, their stretch is by far larger than
that of the oracles of [33, 23]. Therefore the tantalizing problem of whether one can have a
linear-size path-reporting distance oracle with logarithmic stretch remained wide open. In the
current paper we answer this question in the affirmative. For any k, log n / log log n ≤ k ≤ log n,
and any arbitrarily small constant ε > 0, our path-reporting distance oracle has stretch O(k),
size O(n^{1+1/k}) and query time O(n^ε).

(When e > 0 is subconstant the stretch becomes 0{k) ■ Hence our oracle achieves an 

optimal up to constant factors tradeoff between size and stretch in the range
log n / log log n ≤ k ≤ log n, i.e., in the range "missing" in Thorup-Zwick's result. Though our
query time of O(n^ε), for an arbitrarily small constant ε > 0, is much larger than Thorup-Zwick's
query time, we stress that all
existing path-reporting distance oracles either use space Vt{n ■ logn) [33l |36l [I3] or have stretch 
(The query time of the TZ oracle was recently improved to 0(1) in [36l [I3].) The 
only previously existing path-reporting distance oracle that achieves the optimal tradeoff in this
range of parameters can be obtained by constructing a (2k − 1)-spanner^2 with O(n^{1+1/k}) edges
and answering queries by conducting Dijkstra explorations in the spanner. However, with this
approach the query time is O(n^{1+1/k}). Our result is a drastic improvement of this trivial bound
from O(n^{1+1/k}) to O(n^ε), for an arbitrarily small constant ε > 0.

We also can trade between the stretch and the query time. Specifically, a variant of our oracle 
uses 0(?7, log logn) space, has stretch 0(log^°®‘‘/3 ^ n) ~ 0(log®'™n) and query time O(log logn). For 
a comparison, the path-reporting distance oracle of na with this stretch uses space r2(n • jpgi^g^ ) 
and has query time O(log logn ■ log^^ wmax)- 

We also remark that using a super-constant (but not trivial) query time is commonplace by
now in the distance oracles literature. In particular, this is the case in the oracles of Porat and
Roditty [30], Agarwal and Godfrey [5] and of Agarwal et al. [6].

1.2 Distance Oracles with Stretch {a, P) for Unweighted Graphs 

We say that a distance oracle D(G) provides stretch (α, β) for a pair of parameters α ≥ 1, β ≥ 0
if for any query (u, v) it constructs a path Π(u, v) of length δ(u, v) which satisfies d_G(u, v) ≤
δ(u, v) ≤ α · d_G(u, v) + β. The notion of (α, β)-stretch originates from the closely related area of
spanners. A subgraph G' = (V, H) is said to be an (α, β)-spanner of a graph G = (V, E), H ⊆ E,
if for every pair u, v ∈ V, it holds that d_H(u, v) ≤ α · d_G(u, v) + β.
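The (α, β)-spanner condition can be checked directly on small unweighted graphs. The following is a minimal sketch, assuming graphs are given as adjacency dictionaries; the helper names `make_adj`, `bfs_distances` and `is_alpha_beta_spanner` are hypothetical and introduced only for illustration.

```python
from collections import deque

def make_adj(vertices, edges):
    """Build an undirected adjacency dictionary {vertex: set of neighbours}."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def bfs_distances(adj, source):
    """Hop distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def is_alpha_beta_spanner(adj_g, adj_h, alpha, beta):
    """Check d_H(u, v) <= alpha * d_G(u, v) + beta for every pair connected in G."""
    for u in adj_g:
        dist_g = bfs_distances(adj_g, u)
        dist_h = bfs_distances(adj_h, u)
        for v, d in dist_g.items():
            if dist_h.get(v, float("inf")) > alpha * d + beta:
                return False
    return True

# A 4-cycle and the subgraph obtained by dropping the edge (d, a).
cycle = make_adj("abcd", [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
path = make_adj("abcd", [("a", "b"), ("b", "c"), ("c", "d")])
print(is_alpha_beta_spanner(cycle, path, 1, 2))  # True
print(is_alpha_beta_spanner(cycle, path, 2, 0))  # False
```

In this toy example the pair (a, d) has distance 1 in the cycle and 3 in the path, so the path is a (1, 2)-spanner and a (3, 0)-spanner, but not a (2, 0)-spanner. The same distance can be paid for either multiplicatively or additively.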

This notion was introduced in [18], where it was shown that for any ε > 0 and k = 1, 2, ...,
for any n-vertex unweighted graph G = (V, E) there exists a (1 + ε, β)-spanner with O(β · n^{1+1/k})
edges, where β = β(ε, k) is independent of n. Later a number of additional constructions of
(1 -|- e, /3)-spanners with similar properties were devised in [151 ESI 129] . 

It is natural to attempt converting these constructions of spanners into distance oracles with
a similar tradeoff between stretch and size. However, so far such attempts have generally not been
successful. See, e.g., the discussion titled "Additive Guarantees in Distance Oracles" in the
introduction of [25]. Pătraşcu and Roditty [25] devised a distance oracle with stretch (2, 1), size
O(n^{5/3}), and query time O(1). Abraham and Gavoille [1] generalized the result of [25] to devise
a distance oracle with stretch (2k − 2, 1) and space Õ(n^{1+2/(2k−1)}). (The query time in [1] is
unspecified.)

^2 For a parameter t ≥ 1, G' = (V, H) is a t-spanner of a graph G = (V, E), H ⊆ E, if d_H(u, v) ≤ t · d_G(u, v) for every pair u, v ∈ V.
Note, however, that neither of these previous results achieves multiplicative stretch o(k) with
size O(n^{1+1/k}), at the expense of an additive stretch. (This is the case with the result of [18] in
the context of spanners, where the multiplicative stretch becomes as small as 1 + ε, for an
arbitrarily small ε > 0.) In this paper we devise the first distance oracles that do achieve such a
tradeoff.
Specihcally, our path-reporting distance oracle has stretch {0{1), (3{k)), space 0{(3{k) ■ 

(3{k) = and query time 0(n^), for an arbitrarily small e > 0. The multiplicative stretch 

0(1) here is a polynomial function of 1/e, but it can be made much smaller than k. (Think, e.g., 
of e > 0 being a constant and k being a slowly growing function of n.) We can also have stretch 
{o{k),/3(k)), space 0{f3{k) ■ and query time where 7 > 0 is a universal constant. 

(Specifically, the theorem holds, e.g., for 7 = 1/7.) 

In both these results the tradeoff between multiplicative stretch and size of the oracle is below
Erdős' girth conjecture barrier (which is stretch 2k − 1 and space O(n^{1+1/k})). In fact, it is known

that when the additive stretch is 0 , distance oracles for general n-vertex graphs that have size 
( 7 (^i+i/fc) payg multiplicative stretch ^l{k) [33l[22],[2T|. Our results, like the results of [18] 
for spanners, break this barrier by introducing an additive stretch β(k). To the best of our
knowledge, our distance oracles are the first distance oracles that exhibit this behavior.

Using known lower bounds we also show that there exist no distance labeling schemes with 
stretch (0(1), /5(/c)) and maximum label size 0{(3{k) ■ (Rather one needs labels of size 

for this.) This is also the case for routing schemes. (See Section 2 for relevant definitions.) We also
show that in the cell-probe model of computation any distance oracle for unweighted undirected
n-vertex graphs with stretch (O(1), β(k)) and space O(β(k) · n^{1+1/k}) has query time Ω(k). This
is in contrast to distance oracles with multiplicative stretch, which can have constant query time

mm- 


1.3 Distance Oracles for Sparse Graphs 

A central ingredient in all our distance oracles is a new path-reporting distance oracle for graphs
with O(n) edges. The most relevant result in this context is the paper by Agarwal et al. [6]. In
this paper the authors devised a (not path-reporting^3) linear-size distance oracle which given a
parameter k = 1, 2, ... provides distance estimates with stretch 4k − 1, uses linear space and has
time (Their result is, in fact, more general than this. We provide this form of their 

result to facilitate the comparison.) In this paper we present the first path-reporting linear-size
distance oracle for this range of parameters. Specihcally, our linear-size oracle (see Corollary 
16.4p has stretch 0(/c*°®'‘/3^) and query time for any constant parameter k of the form 

k = (4/3)^ h = 1 , 2 ,.... 

1.4 A Distance-Preserving Path-Reporting Distance Oracle 

In [14] the authors showed that for any n-vertex graph G = (V, E) and a collection 𝒫 of P pairs
of vertices there exists a subgraph G' = (V, H) of size O(max{n + √n · P, √P · n}) so that for
every (u, v) ∈ 𝒫, d_H(u, v) = d_G(u, v). In this paper we devise the first distance-oracle counterpart
of this result. Specifically, our distance oracle uses O(n + P^2) space, and for any query (u, v) ∈ 𝒫

^3 It was erroneously claimed in [6] that all their distance oracles are path-reporting. While their distance oracles
with stretch smaller than 3 are path-reporting (albeit their space requirement is superlinear), this is not the case
for their oracles with stretch 4k − 1, k ≥ 1 [4].



it produces the exact shortest path Π between u and v in O(|Π|) time, where |Π| is the number
of edges in Π.

We employ this distance oracle very heavily in all our other constructions. 

Remark: The construction time of our distance-preserving oracle is O(n · P^2) + O(m · min{n, P}).
The construction time of our path-reporting oracle for sparse graphs is O(m · n) = O(n^2 · λ), where
λ = m/n. The construction time of our oracles with nearly-linear space for general graphs is
O(n^{2+1/k}). Finally, the construction time of our oracle for unweighted graphs with a hybrid
multiplicative-additive stretch is (In both cases k is the 

stretch parameter of the respective oracle.) 

1.5 Related Work 

There is a huge body of literature about distance oracles by now. In addition to what we have
already surveyed there are probe-complexity lower bounds by Sommer et al. [32]. There is an
important line of work by Pătraşcu et al. [26, 25] on oracles with rational stretch. Finally, Baswana
and Sen [11], Baswana and Kavitha [10] and Baswana et al. [9] improved the preprocessing time
of the TZ oracle.

1.6 Structure of the Paper 

We start with describing our distance preserving oracle (Section 3). We then proceed with
devising our basic path-reporting oracle for sparse graphs (Section 4). This oracle can be viewed as a
composition of an oracle from Agarwal et al. [6] with our distance-preserving oracle from Section
3. The oracle is described for graphs with small arboricity. Its extension to general sparse graphs
(based on a reduction from [6]) is described in Section 5. Then we devise a much more elaborate
multi-level path-reporting oracle for sparse graphs. The oracle of [6] and our basic oracle from
Section 4 both use just one set of sampled vertices. Our multi-level oracle uses a carefully constructed
hierarchy of sampled sets which enables us to get the query time down from O(n^{1/2+ε}) to O(n^ε).
Next we proceed (Section 6) to using this multi-level oracle for a number of applications. Specifically,
we use it to construct a linear-size logarithmic stretch path-reporting oracle with query time O(n^ε),
a linear-size polylogarithmic stretch path-reporting oracle with query time O(log log n), and finally,
oracles that break the girth barrier for unweighted graphs. Our lower bounds can be found in 
Section [3 

2 Preliminaries 

For a pair of integers a ≤ b, we denote [a, b] = {a, a + 1, ..., b}, and [b] = [1, b]. The arboricity
of a graph G is given by λ(G) = max_{U ⊆ V, |U| ≥ 2} ⌈|E(U)| / (|U| − 1)⌉, where E(U) is the set of
edges induced by the vertex set U. We denote by deg_G(u) the degree of a vertex u in G; we omit G
from this notation whenever G can be understood from the context. We use the notation Õ(f(n)) =
O(f(n) · polylog(f(n))) and Ω̃(f(n)) = Ω(f(n) / polylog(f(n))). We say that a function f(·) is
quasi-polynomial if f(n) ≤ 2^{log^{O(1)} n}.

A distance-labeling scheme for a graph G = (V, E) assigns every vertex v ∈ V a short label
φ(v). Given a pair of labels φ(u), φ(v) of a pair of vertices u, v ∈ V, the scheme computes an
estimate δ(φ(u), φ(v)). This estimate has to be within a factor α, for some α ≥ 1, from the actual
distance d_G(u, v) between u and v in G. The parameter α is called the stretch of the labeling
scheme, and the maximum number of bits employed by one of the labels is called the (maximum)
label size of the scheme.

A closely related notion is that of a compact routing scheme. Here each vertex v is assigned
a label φ(v) and a routing table ψ(v). Given a label φ(u) of the routing destination u and its own
routing table ψ(v), the vertex v = v_0 needs to be able to compute the next hop v_1. Given the table
ψ(v_1) of v_1 and the destination's label φ(u), the vertex v_1 computes the next hop v_2, etc. The
resulting path v = v_0, v_1, v_2, ... has to end up eventually in u, and its length needs to be at most
α times longer than the length of the shortest u − v path in G, for a stretch parameter α ≥ 1. In
addition to stretch, another important parameter in this context is the maximum number of bits
used by the label and the routing table (together) of any individual vertex. This parameter will
be referred to as the maximum memory requirement of a routing scheme.

3 A Distance-Preserving Path-Reporting Oracle 

Consider a directed weighted n-vertex graph G = (V, E, ω). (The result given in this section
applies to both directed and undirected graphs. However, our other distance oracles apply only
to undirected graphs.) Let Pairs be a set of ordered pairs of vertices. We denote
its cardinality by P = |Pairs|. In this section we describe a distance oracle which given a pair
(u, v) ∈ Pairs returns a shortest path Π_{u,v} from u to v in G. The query time of the oracle is
proportional to the number of edges (hops) in Π_{u,v}. The oracle uses O(n + P^2) space.

The construction of the oracle starts with computing a set Paths = {Π_{u,v} | (u, v) ∈ Pairs} of
shortest paths between pairs of vertices from Pairs. This collection of shortest paths is required
to satisfy the property that if two distinct paths Π, Π' ∈ Paths traverse two common vertices x
and y in the same order (e.g., both traverse first x and then y), then they necessarily share
the entire subpath between x and y. It is argued in [14] that this property can be easily achieved.

We will need the following definitions from [14].

For a path Π = (u_0, u_1, ..., u_h) and a vertex u_i ∈ V(Π), the predecessor of u_i in Π, denoted
pred_Π(u_i), is the vertex u_{i−1} (assuming that i ≥ 1; otherwise it is defined as NULL), and the
successor of u_i in Π, denoted succ_Π(u_i), is the vertex u_{i+1} (again, assuming that i ≤ h − 1;
otherwise it is NULL).

Definition 3.1 [14] A branching event (Π, Π', x) is a triple with Π, Π' ∈ Paths being two distinct
paths and x ∈ V(Π) ∩ V(Π') being a vertex that belongs to both paths and such that
{pred_Π(x), succ_Π(x)} ≠ {pred_Π'(x), succ_Π'(x)}. We will also say that the two paths Π, Π' branch
at the vertex x.

Note that under this definition if Π traverses edges (u_{i−1}, u_i), (u_i, u_{i+1}) and Π' traverses edges
(u_{i+1}, u_i), (u_i, u_{i−1}) then (Π, Π', u_i) is not a branching event.
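The definition of a branching event translates into a short test. Below is a minimal sketch, assuming each path is given as a list of distinct vertices; the function names are hypothetical, and Python's `None` plays the role of NULL.

```python
def neighbours_on_path(path, x):
    """Return the unordered set {pred(x), succ(x)} on `path`; None stands for NULL."""
    i = path.index(x)
    pred = path[i - 1] if i > 0 else None
    succ = path[i + 1] if i + 1 < len(path) else None
    return {pred, succ}

def is_branching_event(path_a, path_b, x):
    """True if both paths contain x and their predecessor/successor sets at x differ."""
    if x not in path_a or x not in path_b:
        return False
    return neighbours_on_path(path_a, x) != neighbours_on_path(path_b, x)

# Two paths traversing the same edges in opposite directions do not branch at b.
print(is_branching_event(["a", "b", "c"], ["c", "b", "a"], "b"))  # False
# Paths entering b from different vertices do branch at b.
print(is_branching_event(["a", "b", "c"], ["e", "b", "c"], "b"))  # True
```

Comparing unordered sets is what makes a reversed traversal of the same two edges a non-event, in line with the note above.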

It follows directly from the above property of the collection Paths (see also [14], Lemma 7.5,
for a more elaborate discussion) that for every pair of distinct paths Π, Π' ∈ Paths, there are at
most two branching events that involve that pair of paths. Let B denote the set of branching
events. The overall number of branching events for the set Paths is |B| ≤ |Paths|^2 = P^2. Our
oracle will keep O(1) data for each vertex, O(1) data for each branching event, and O(1) data for
each path. Hence the oracle stores O(n + |B| + P) data in total.
Specifically, in our oracle for every vertex v ∈ V we keep an identity of some path Π ∈ Paths
that contains v as an internal point, and two edges of Π incident on v. (If there is no path
Π ∈ Paths that contains v as an internal point, then our oracle stores nothing for v in this data
structure.) The path Π stored for v will be referred to as the home path of v.

In addition, for every branching event (Π, Π', v) we keep the (at most four) edges of Π and Π'
incident on v. Finally, for every pair (x, y) ∈ Pairs we also store the first and the last edges of the
path Π_{x,y}. Observe that the resulting space requirement is at most O(n + |B| + P) = O(n + P^2). We
assume that the branching events are stored in a hash table of linear size, which allows membership
queries in O(1) time per query.

The query algorithm proceeds as follows. Given a pair (x, y) ∈ Pairs, we find the first edge
(x, x') of the path Π_{x,y}, and "move" to x'. Then we check if (x', y) is the last edge of Π_{x,y}. If it is
then we are done. Otherwise let Π(x') denote the home path of x'. (Observe that since the vertex
x' is an internal vertex in Π_{x,y}, it follows that there exists a home path Π(x') for x'.)

Next, we check if Π(x') = Π_{x,y}. (This test is performed by comparing the identities of the two
paths.) If it is the case then we fetch the next edge (x', x'') of Π(x'), and move to x''. Otherwise
(if Π(x') ≠ Π_{x,y}) we check if the triple (Π(x'), Π_{x,y}, x') is a branching event. This check is
performed by querying the branching events' hash table.

If there is no branching event (Π(x'), Π_{x,y}, x') then we again fetch the next edge (x', x'') of
Π(x'), and move to x''. (In fact, the algorithm does not need to separate between this case and
the case that Π(x') = Π_{x,y}. We distinguished between these cases here for clarity of presentation.)

Finally, if there is a branching event (Π(x'), Π_{x,y}, x') then we fetch from our data structure all
the information associated with this event. In particular, we fetch the next edge (x', x'') of Π_{x,y},
and move to x''.

In all cases the procedure then recurses with x''. It is easy to verify that using appropriate
hash tables all queries can be implemented in O(1) time per vertex, and in total O(|Π_{x,y}|) time.
We summarize this section with the following theorem. 

Theorem 3.2 Given a directed weighted graph G = (V, E, ω) and a collection Pairs of pairs
of vertices, our distance-preserving path-reporting oracle (shortly, DPPRO) reports shortest paths
Π_{x,y} for query pairs (x, y) ∈ Pairs in O(|Π_{x,y}|) time. The oracle employs O(n + |B| + P) = O(n + P^2)
space, where B is the set of branching events for a fixed set of shortest paths between pairs of vertices
from Pairs, and P = |Pairs|.

One can construct the shortest paths in O(m · min{P, n}) time. Then for each vertex v one
keeps the list of paths that traverse v. For every such path one keeps the two edges of this path
which are incident on v. In overall O(n · P^2) additional time one can use these lists to create the
list of branching events. A hash table with them can be constructed in additional O(P^2) time.
Hence the overall construction time of this oracle is O(m · min{P, n}) + O(n · P^2).
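The construction and the query procedure of this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the input `paths` is assumed to already be a consistent collection of shortest paths (one list of distinct vertices per pair, with at least two vertices each), the names `build_dppro` and `query_dppro` are hypothetical, and for brevity only branching events against each vertex's home path are stored, since those are the only ones the query consults. The walk stops on reaching y instead of testing for a stored last edge.

```python
from collections import defaultdict

def build_dppro(paths):
    """Build the oracle from `paths`, a dict {(x, y): [x, ..., y]} (assumed consistent)."""
    through = defaultdict(list)   # v -> [(pair, pred, succ)] over paths having v internal
    first_hop = {}                # pair -> second vertex of its path (the first edge)
    for pair, p in paths.items():
        first_hop[pair] = p[1]
        for i in range(1, len(p) - 1):
            through[p[i]].append((pair, p[i - 1], p[i + 1]))
    home = {}                     # v -> (home pair, pred, succ): O(1) data per vertex
    branch = {}                   # (home pair, pair, v) -> successor of v on `pair`
    for v, entries in through.items():
        home_pair, hpred, hsucc = entries[0]
        home[v] = (home_pair, hpred, hsucc)
        for pair, pred, succ in entries[1:]:
            if {pred, succ} != {hpred, hsucc}:      # branching event at v
                branch[(home_pair, pair, v)] = succ
    return first_hop, home, branch

def query_dppro(oracle, x, y):
    """Report the stored path for (x, y), spending O(1) hash lookups per vertex."""
    first_hop, home, branch = oracle
    pair = (x, y)
    prev, cur = x, first_hop[pair]
    result = [x, cur]
    while cur != y:
        home_pair, hpred, hsucc = home[cur]
        nxt = branch.get((home_pair, pair, cur))
        if nxt is None:
            # No branching event: the query path uses the same two edges at `cur`
            # as the home path, so continue along the edge we did not arrive by.
            nxt = hsucc if prev == hpred else hpred
        result.append(nxt)
        prev, cur = cur, nxt
    return result

paths = {
    ("a", "d"): ["a", "b", "c", "d"],
    ("e", "d"): ["e", "b", "c", "d"],
    ("a", "f"): ["a", "b", "f"],
}
oracle = build_dppro(paths)
print(query_dppro(oracle, "e", "d"))  # ['e', 'b', 'c', 'd']
print(query_dppro(oracle, "a", "f"))  # ['a', 'b', 'f']
```

In the example, vertex b has home path (a, d); the two other paths branch from it at b, so two branching events are stored there, while at c no event is needed because both paths through c use the same pair of edges.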

Observe that if one is given a set S, |S| = n^{1/4}, of terminals, then Theorem 3.2 provides
a linear-size DPPRO (i.e., O(1) words per vertex on average) which can report shortest paths
between all pairs of terminals. It is well-known that any distance labeling scheme which is
guaranteed to return exact distances between all pairs of terminals must use maximum label size
[33]. This is also the case for compact routing schemes [3l]. (In the latter case the lower 
bound of D(n^/^) is on the maximum memory requirement of any individual vertex.) 
We remark that our DPPRO here employs O(n + |B| + P) space, whereas the underlying
distance preserver has O(n + √(n · |B|)) edges [14]. It is plausible that there exists a DPPRO of
size O(n + √(n · |B|)). We leave this question open.


4 A Basic Distance Oracle for Graphs with 
Bounded Arboricity 

In this section we describe a basic variant of our path-reporting distance oracle for weighted
undirected graphs G = (V, E, ω) of arboricity λ(G) ≤ λ, for some parameter λ. (We will mostly
use this oracle for constant or small values of λ. On the other hand, the result is meaningful for
higher values of λ as well.) Our oracle reports paths of stretch O(k), for some positive integer
parameter k. Unlike the partial oracle from Section 3, the oracle in this section is a full one,
i.e., it reports paths for all possible queries (u, v) ∈ (V choose 2). This is the case also for all our
other oracles, which will be described in subsequent sections. The expected query time of our oracle is

(9(,7,V2+2fc^. y). 

(Whp0, the query time is Togn-A).) The oracle requires 0(n) space, 

in addition to the space required to store the graph G itself. Observe that for λ = O(1) the query
time is O(n^{1/2+ε}), for an arbitrarily small constant ε > 0, while the stretch is O(k) = O(1). In
Section 5 we extend this oracle to general m-edge n-vertex graphs with λ = m/n.

Our basic oracle employs just one level of sampled vertices, which we (following the terminology
of [6]) call landmarks. Each v ∈ V is sampled independently at random with probability ρ/n, where
ρ is a parameter which will be determined in the sequel. Denote by L the set of sampled vertices
(landmarks). Note that E(|L|) = ρ.

For every vertex v ∈ V we keep the path Π(v) to its closest landmark vertex ℓ(v), breaking ties
arbitrarily. Denote by D(v) the length ω(Π(v)) of this path. This is a collection of vertex-disjoint
shortest paths trees (shortly, SPTs) {T(u) | u ∈ L}, where each T(u) is an SPT rooted at u for the
subset {v | d_G(u, v) ≤ d_G(u', v), ∀u' ≠ u, u, u' ∈ L}. (Ties are broken arbitrarily.) This collection
is a forest, and storing it requires O(n) space.
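The sampling step and the forest of shortest-path trees can be illustrated with a multi-source Dijkstra computation. The sketch below is an assumption-laden illustration rather than the paper's procedure: the graph is an adjacency dictionary `{v: [(neighbour, weight), ...]}` with non-negative weights, and the names `sample_landmarks`, `landmark_forest` and `path_to_landmark` are hypothetical.

```python
import heapq
import itertools
import random

def sample_landmarks(vertices, rho, rng=random):
    """Sample every vertex independently with probability rho / n."""
    n = len(vertices)
    return {v for v in vertices if rng.random() < rho / n}

def landmark_forest(graph, landmarks):
    """Multi-source Dijkstra: distance, closest landmark and SPT parent of every vertex."""
    dist, nearest, parent = {}, {}, {}
    tie = itertools.count()   # unique tie-breaker so heap entries never compare vertices
    heap = [(0, next(tie), l, l, None) for l in landmarks]
    heapq.heapify(heap)
    while heap:
        d, _, v, root, par = heapq.heappop(heap)
        if v in dist:
            continue          # already settled from a closer (or equally close) landmark
        dist[v], nearest[v], parent[v] = d, root, par
        for y, w in graph[v]:
            if y not in dist:
                heapq.heappush(heap, (d + w, next(tie), y, root, v))
    return dist, nearest, parent

def path_to_landmark(parent, v):
    """Follow parent pointers from v up to the root of its tree (its closest landmark)."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

graph = {
    "a": [("b", 1)],
    "b": [("a", 1), ("c", 1)],
    "c": [("b", 1), ("d", 1)],
    "d": [("c", 1), ("e", 1)],
    "e": [("d", 1)],
}
dist, nearest, parent = landmark_forest(graph, {"a", "e"})
print(path_to_landmark(parent, "b"))  # ['b', 'a']
```

Each vertex stores a single parent pointer, which is why the whole forest fits in O(n) space; the path Π(v) is recovered in time proportional to its number of edges.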

The oracle also stores the original graph G. For the set of landmarks we compute the complete
graph C = (L, (L choose 2), d_G|_L). Here d_G|_L stands for the metric of G restricted to the point set L.
(In other words, in the landmarks graph C, for every pair u, u' ∈ L of distinct landmarks the weight
ω_C(u, u') of the edge (u, u') connecting them is defined by ω_C(u, u') = d_G(u, u').)
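The landmarks graph is simply the metric closure of G restricted to L, so a straightforward (not necessarily the most efficient) way to obtain it is one Dijkstra run per landmark. The following sketch assumes an adjacency-dictionary graph with non-negative weights and mutually comparable vertex labels; `dijkstra` and `landmarks_graph` are hypothetical helper names.

```python
import heapq

def dijkstra(graph, source):
    """Exact distances from `source`; graph is {v: [(neighbour, weight), ...]}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue                      # stale heap entry
        for y, w in graph[x]:
            nd = d + w
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    return dist

def landmarks_graph(graph, landmarks):
    """Edge weights of the complete graph on the landmarks: w(u, u') = d_G(u, u')."""
    weights = {}
    for u in landmarks:
        dist = dijkstra(graph, u)
        for v in landmarks:
            if u != v and v in dist:
                weights[frozenset((u, v))] = dist[v]
    return weights

graph = {
    "a": [("b", 1), ("d", 10)],
    "b": [("a", 1), ("c", 2)],
    "c": [("b", 2), ("d", 1)],
    "d": [("c", 1), ("a", 10)],
}
print(landmarks_graph(graph, {"a", "d"}))  # {frozenset({'a', 'd'}): 4}
```

Note that the edge (a, d) of the landmarks graph has weight 4 even though G contains a direct edge of weight 10 between the same endpoints; edges of the landmarks graph correspond to shortest paths in G, not to edges of G.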

Next we invoke Thorup-Zwick's distance oracle [33] with a parameter k. (Henceforth we will
call it the TZ oracle.) One can also use here Mendel-Naor's oracle [23], but the resulting tradeoff
will be somewhat inferior to the one that is obtained via the TZ oracle. Denote by ℋ the TZ
distance oracle for the landmarks graph C. The oracle requires O(k · |L|^{1+1/k}) space, and it provides
(2k − 1)-approximate paths Π_{u,u'} in C for pairs of landmarks u, u' ∈ L. The query time is O(k)
(plus O(|Π_{u,u'}|)). Observe that some edges of Π_{u,u'} may not belong to the original graph G. We
note also that by using more recent oracles [13, 36] one can have here query time O(1), but this
improvement is immaterial for our purposes.

The TZ oracle ℋ has a useful property that the union H = ∪{Π_{u,u'} | (u, u') ∈ (L choose 2)} of all paths
that the oracle returns forms a sparse (2k − 1)-spanner. Specifically, E(|H|) = O(k · |L|^{1+1/k}).

^4 Here and thereafter we use the shortcut "whp" for "with high probability". The meaning is that the probability
is at least 1 − n^{−c}, for some constant c ≥ 2.





(This property holds for Mendel-Naor's oracle as well, but there the stretch of the spanner is
O(k), where the constant hidden by the O-notation is greater than 2. On the other hand, their
space requirement is O(|L|^{1+1/k}), rather than O(k · |L|^{1+1/k}).) Fix an oracle ℋ as above for
which |H| = O(k · |L|^{1+1/k}). Whp such an ℋ will be computed by running the procedure that
computes the TZ oracle O(log n) times. We will view the spanner H as a collection of pairs of vertices
of our original graph G.

Finally, we invoke our distance preserving oracle (shortly, DPPRO) from Section 3 on the graph
G and set Pairs = H. We will refer to this oracle as D(G, H). Its size is, with high probability,
O(n + |H|^2) = O(n + k^2 · |L|^{2+2/k}). Upon a query (y, y') ∈ H, this oracle returns a shortest path
Π_{y,y'} between y and y' in G in time O(|Π_{y,y'}|).

Observe that |L| is the sum of identical independent indicator random variables |L| = Σ_{v ∈ V} I_v,
where I_v is the indicator random variable of the event {v ∈ L}. Hence, by Chernoff's inequality,
for any constant ε > 0,

P(|L| > (1 + ε) · E(|L|)) = P(|L| > (1 + ε) · ρ) ≤ exp(−Ω(ρ)) .

We will set the parameter ρ to be at least c log n, for a sufficiently large constant c. This will
ensure that whp |L| = O(ρ), and so |H| = O(k · ρ^{1+1/k}). Set ρ so that k^2 · ρ^{2+2/k} = O(n), i.e.,
ρ = n^{k/(2k+2)} up to a factor that depends only on k. This guarantees that aside from the storage
needed for the original graph, the total space used by our oracle is O(n).

This completes the construction algorithm of our oracle. Next we describe its query algorithm.
We need the following definition. For a vertex v ∈ V, let Ball(v) = {x | d_G(v, x) < d_G(v, ℓ(v))}
denote the set of all vertices x which are closer to v than the closest landmark vertex ℓ(v) to v.

Given a pair u, v of vertices of G, our oracle starts with testing if u ∈ Ball(v) and if v ∈ Ball(u).
To test if u ∈ Ball(v) we just conduct a Dijkstra exploration rooted at v in the graph G, until we
discover either u or ℓ(v). (Recall that G is stored in our oracle.) If u is discovered before ℓ(v)
we conclude that u ∈ Ball(v), and return the (exact) shortest path between them. Otherwise we
conclude that u ∉ Ball(v). Analogously, the algorithm tests if v ∈ Ball(u).
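The ball test amounts to a Dijkstra exploration that is cut off as soon as either the target or a landmark is settled. Below is a minimal sketch under a few assumptions: the graph is an adjacency dictionary with non-negative weights, vertex labels are mutually comparable (so heap ties can be broken), and ties between the target and a landmark at equal distance are resolved by heap order. The name `explore_until` is hypothetical.

```python
import heapq

def explore_until(graph, v, target, landmarks):
    """Dijkstra from v, stopped at `target` or at the first landmark settled.

    Returns (target_found, path from v to the vertex that stopped the search, its length).
    """
    dist = {v: 0}
    parent = {v: None}
    settled = set()
    heap = [(0, v)]
    while heap:
        d, x = heapq.heappop(heap)
        if x in settled:
            continue
        settled.add(x)
        if x == target or x in landmarks:
            path = [x]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            path.reverse()
            return x == target, path, d
        for y, w in graph[x]:
            if y not in settled and d + w < dist.get(y, float("inf")):
                dist[y] = d + w
                parent[y] = x
                heapq.heappush(heap, (d + w, y))
    return False, [], float("inf")   # neither the target nor a landmark is reachable

graph = {
    "a": [("b", 1)],
    "b": [("a", 1), ("c", 1)],
    "c": [("b", 1), ("d", 1)],
    "d": [("c", 1), ("e", 1)],
    "e": [("d", 1)],
}
print(explore_until(graph, "a", "b", {"c"}))  # (True, ['a', 'b'], 1)
print(explore_until(graph, "a", "e", {"c"}))  # (False, ['a', 'b', 'c'], 2)
```

When the test fails, the search has nevertheless found the closest landmark and a shortest path to it, which is exactly what the remainder of the query needs.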

Henceforth we assume that u ∉ Ball(v) and v ∉ Ball(u), and therefore the two searches returned
u' = ℓ(u), v' = ℓ(v), and the shortest paths Π(u) and Π(v) between u and u' and between v and
v', respectively. (In fact, using the forest of SPTs rooted at landmarks that our oracle stores,
the query algorithm can compute shortest paths between u and u' and between v and v' in time
proportional to the lengths of these paths.) Observe that d_G(u', v') ≤ d_G(u', u) + d_G(u, v) + d_G(v, v'),
and d_G(u', u), d_G(v, v') ≤ d_G(u, v). Hence d_G(u', v') ≤ 3 · d_G(u, v).

Then the query algorithm invokes the query algorithm of the oracle ℋ for the landmarks graph
C. The latter algorithm returns a path Π' = (u' = z_0, z_1, ..., z_h = v') in C between u' and v'. The
length ω_C(Π') of this path is at most (2k − 1) · d_G(u', v') ≤ (6k − 3) · d_G(u, v). The time required for
this computation is O(k + h), where |Π'| = h. For each edge (z_i, z_{i+1}) ∈ Π', i ∈ [0, h − 1], we invoke
the query algorithm of the DPPRO D(G, H). (The edges (z_i, z_{i+1}) of the path Π' are typically not
edges of the original graph. H is a (2k − 1)-spanner of C produced by the oracle ℋ. Observe that
Π' ⊆ H, and so (z_i, z_{i+1}) ∈ H, for every index i ∈ [0, h − 1].) The oracle D(G, H) returns a path
Π̂_i between z_i and z_{i+1} in G of length ω_C(z_i, z_{i+1}) = d_G(z_i, z_{i+1}). Let Π̂ = Π̂_0 · Π̂_1 · ... · Π̂_{h−1} be the
concatenation of these paths. Observe that Π̂ is a path in G between z_0 = u' and z_h = v', and

ω(Π̂) = Σ_{i=0}^{h−1} ω(Π̂_i) = Σ_{i=0}^{h−1} d_G(z_i, z_{i+1}) = Σ_{i=0}^{h−1} ω_C(z_i, z_{i+1}) = ω_C(Π') ≤ (6k − 3) · d_G(u, v) .
Finally, the query algorithm returns the concatenated path Π̃ = Π(u) · Π̂ · Π(v) as the approximate
path for the pair u, v. This completes the description of the query algorithm of our basic oracle.
Observe that

ω(Π̃) = ω(Π(u)) + ω(Π̂) + ω(Π(v)) ≤ d_G(u, v) + (6k − 3) · d_G(u, v) + d_G(u, v) = (6k − 1) · d_G(u, v) .

Next, we analyze the running time of the query algorithm. First, consider the step that tests if
v ∈ Ball(u) and if u ∈ Ball(v). Denote by X the random variable that counts the number of
vertices discovered by some fixed Dijkstra exploration originated at u before the landmark ℓ(u)
is discovered. We order all graph vertices by their distance from u in a non-decreasing order,
i.e., u = u_0, u_1, ..., u_{n−1}, such that d_G(u, u_i) ≤ d_G(u, u_j) for i ≤ j. (This is the order in which
the aforementioned Dijkstra exploration originated at u discovers them.) For an integer value
1 ≤ t ≤ n − 1, the probability that X = t is equal to the probability that the vertices u_0, u_1, ..., u_{t−1}
are all not sampled and the vertex u_t is sampled. Hence X is distributed geometrically with the
parameter p = ρ/n. Hence

E(X) = Σ_{t=1}^{n−1} (1 − p)^t · p · t ≤ 1/p = n/ρ .     (1)

Also, obviously for any positive constant c, P(X > -clnn) < (1 — i.e., whp 

A = O(^logn). 
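As a numerical sanity check of Equation (1), the sum can be evaluated directly. The sketch below is only an illustration; the function name and the sample values of $n$ and $\rho$ are assumptions, not part of the construction.

```python
def expected_discovered(n: int, rho: float) -> float:
    """Evaluate sum_{t=1}^{n-1} (1-p)^t * p * t for p = rho / n."""
    p = rho / n
    return sum((1 - p) ** t * p * t for t in range(1, n))


if __name__ == "__main__":
    n, rho = 10_000, 100.0          # hypothetical sample values
    value = expected_discovered(n, rho)
    bound = n / rho                 # the right-hand side of Equation (1)
    print(value, bound)
```

For these values the sum is very close to $(1-p)/p$, which is indeed below $1/p = n/\rho$.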

Recall that the graph $G$ has arboricity at most $\lambda$, and thus any set of $n' \le n$ vertices induces
$O(n' \cdot \lambda)$ edges. Hence Dijkstra's algorithm traverses expected $O(\frac{n}{\rho}\lambda)$ edges, and whp $O(\frac{n}{\rho}\lambda \log n)$
edges. In an unweighted graph such exploration requires time linear in the number of edges, and
in weighted graphs the required time is $O(\frac{n}{\rho}(\lambda + \log n))$ in expectation, and $O(\frac{n}{\rho}\lambda \cdot \log n)$ whp.
(Recall that Dijkstra's algorithm that scans a subgraph $(V', E')$ requires time $O(|E'| + |V'| \log |V'|)$.)
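The ball test described above can be sketched as a Dijkstra search that stops as soon as it settles either the other query vertex or a landmark. This is a minimal illustration under stated assumptions: the adjacency-list format, the function names `ball_search` and `path_to`, and the tie-breaking by heap order are choices made here, not details fixed by the text.

```python
import heapq


def ball_search(adj, landmarks, source, target):
    """Dijkstra from `source`, stopped when `target` or a landmark is settled.

    adj: dict mapping a vertex to a list of (neighbor, positive weight).
    Returns (kind, vertex, dist, parent) with kind "target" or "landmark",
    or (None, None, None, parent) if neither is reachable.
    """
    dist = {source: 0}
    parent = {source: None}
    settled = set()
    heap = [(0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if x in settled:
            continue                      # stale heap entry
        settled.add(x)
        if x == target:
            return "target", x, d, parent
        if x in landmarks:
            return "landmark", x, d, parent
        for y, w in adj[x]:
            nd = d + w
            if y not in dist or nd < dist[y]:
                dist[y] = nd
                parent[y] = x
                heapq.heappush(heap, (nd, y))
    return None, None, None, parent


def path_to(parent, x):
    """Recover the path from the search source to x from parent pointers."""
    path = []
    while x is not None:
        path.append(x)
        x = parent[x]
    return path[::-1]
```

If the search returns kind "target", the recovered path is an exact shortest path; otherwise the returned landmark plays the role of $\ell(u)$.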

The second step of our query algorithm queries the distance oracle $\mathcal{H}$ for the landmarks graph
$\mathcal{L}$. (The query is $(u',v')$, $u' = \ell(u)$, $v' = \ell(v)$.) This query returns a path $\Pi'$ between $u'$ and $v'$
in $\mathcal{L}$ in time $O(|\Pi'| + k)$. Finally, for each of the $h = |\Pi'|$ edges $(z_i, z_{i+1})$, $i = 0, 1, \ldots, h-1$, of
the path $\Pi'$, the query algorithm invokes our DPPRO $\mathcal{D}(G,H)$ with the query $(z_i, z_{i+1})$. This
oracle returns the shortest path $\Pi_i$ between $z_i$ and $z_{i+1}$ in $G$ within time $O(|\Pi_i|)$. Finally, the
algorithm returns the concatenated path $\hat\Pi = \Pi(u) \cdot \Pi_0 \cdot \Pi_1 \cdot \ldots \cdot \Pi_{h-1} \cdot \Pi(v)$. The running time
required for producing the path $\Pi_0 \cdot \ldots \cdot \Pi_{h-1}$ is $O(\sum_{i=0}^{h-1} |\Pi_i|) = O(|\hat\Pi|)$, and $|\Pi'| \le |\hat\Pi|$. Hence
the overall expected running time of the algorithm is $O(\frac{n}{\rho} \cdot \lambda + |\hat\Pi|)$ for unweighted graphs, and
is $O(\frac{n}{\rho} \cdot (\lambda + \log n) + |\hat\Pi|)$ for weighted. (Observe that the additive term of $O(k)$ is dominated by
$O(\frac{n}{\rho} \cdot \lambda)$. Specifically, we will be using $\rho \le n/\log n$, and $k \le O(\log n)$.) For the high-probability
bounds one needs to multiply the first term of the running time by an additional $O(\log n)$ factor
in both the unweighted and the weighted cases.

Now we substitute $\rho = n^{\frac{k}{2k+2}}/k$. The resulting expected query time becomes
$O(k \cdot n^{\frac{1}{2} + \frac{1}{2k+2}} \cdot \lambda) + O(|\hat\Pi|)$. We summarize the properties of our basic oracle in the following theorem.

Footnote: One subtlety: we have to avoid scanning too many edges with just one endpoint in Ball(u). We store the edges
incident to each vertex $x$ in increasing order of their weights, and relax them in that order when $x$ is scanned. As
soon as an edge $(x,y)$ is relaxed such that the tentative distance to $y$ is greater than $d_G(u,\ell(u))$ we can dispense
with relaxing the remaining edges. Alternatively, a modification of the sampling rule which we describe in Section 5
also resolves this issue.

Theorem 4.1 For an undirected n-vertex graph $G$ of arboricity $\lambda$ and a positive integer parameter
$k = 1, 2, \ldots$, there exists a path-reporting distance oracle of size (whp) $O(n)$ (in addition to the
size required to store the input graph $G$) that returns $(6k-1)$-approximate shortest paths $\hat\Pi$. The
expected query time is $O(n^{\frac{1}{2}+\frac{1}{2k+2}} \cdot k \cdot \lambda)$ in unweighted graphs and $O(n^{\frac{1}{2}+\frac{1}{2k+2}} \cdot k \cdot (\lambda + \log n))$ in
weighted ones. (The same bounds on the query time apply whp if one multiplies them by $O(\log n)$.)
In addition, in all cases the query time contains the additive term $O(|\hat\Pi|)$.

In particular, Theorem 4.1 implies that for any constant $\epsilon > 0$ one can have a path-reporting
oracle with query time $O(n^{1/2+\epsilon}\lambda)$ which provides $O(1)$-approximate shortest paths for weighted
undirected graphs. Observe also that for $k = 1$ we obtain a 5-approximate path-reporting oracle
with query time $O(n^{3/4}\lambda)$. We remark that to get the latter oracle one does not need to use the TZ
oracle for the landmarks graph $\mathcal{L}$. Rather one can build a DPPRO $\mathcal{D}$ for all pairs of landmarks.
(In this case $|L| = O(\rho)$, $|Pairs| = |\binom{L}{2}| = O(\rho^2) = O(\sqrt{n})$, and so the size of the oracle
$\mathcal{D}$ is $O(|Pairs|^2 + n) = O(n)$.)

One can build the forest of SPTs rooted at the landmarks in 0{m) time. In additional 0{m-p) = 
0{k ■ m ■ 77 ,^'^^“ 2 fc+ 2 ) time one can construct the metric closure of L, i.e., the graph C. This graph 
has n' = p vertices and m' < p^ edges. In 0{km' ■ = 0{kp‘^~^^^^) = 0{k ■ time one 

can construct the TZ oracle for it. To construct the DPPRO with P = 0{k ■ = 0{k ■ 

pairs one needs 0{n ■ P^) + 0{k ■ m ■ = 0{kf ■ n^) -|- 0{k ■ m ■ time. Hence the 

overall construction time of this oracle is 0{k'^ ■ n^) -|- 0{k ■ m ■ 2 / 0 + 2 ). 

In Section 5 we show (see Corollary 5.1) that Theorem 4.1 extends to general graphs with
$m = \lambda \cdot n$ edges.


5 An Extension to General Graphs 


In this section we argue that Theorem 4.1 can be extended to general n-vertex graphs $G = (V, E, \omega)$
with $m = \lambda n$ edges. In its current form the theorem only applies to graphs of arboricity at most
$\lambda$. While this is sufficient for our main application, i.e., for Theorem 6.7, our other application
(Theorem 6.8) requires a more general result. Our extension is based on the reduction of Agarwal
et al. [6] of the distance oracle problem in general graphs to the same problem in bounded-degree
graphs. Our argument is somewhat more general than the one from [6], as it also applies to
path-reporting distance oracles. We provide our extension for the sake of completeness.

Given an m-edge n-vertex graph $G$ with $\lambda = m/n$, we split each vertex $u$ into $d(u) = \lceil \deg(u)/\lambda \rceil$
copies $u^{(1)}, \ldots, u^{(d(u))}$. Each copy is now selected independently at random with probability
$\rho/n$, for a parameter $\rho$ determined in the same way as in Section 4. The original vertex $u$ is
selected to the landmarks' set if and only if at least one of its copies (which will also be called
virtual nodes) is selected. Observe that the rule that we have described is equivalent, up to
lower-order terms, to selecting $u$ with probability $d(u) \cdot \frac{\rho}{n}$.

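The expected number of selected virtual nodes is

$$\sum_{v \in V} \frac{\rho}{n} \cdot d(v) = \frac{\rho}{n} \sum_{v \in V} \left\lceil \frac{\deg(v)}{\lambda} \right\rceil \le \frac{\rho}{n} \sum_{v \in V} \left( \frac{\deg(v)}{\lambda} + 1 \right) = \rho + \frac{\rho}{\lambda n} \sum_{v \in V} \deg(v) = 3\rho .$$

The number $|L|$ of landmarks is at most the number of selected virtual nodes, and so $\mathbb{E}(|L|) \le 3\rho$.
By Chernoff's bound, the number of selected virtual nodes is whp $O(\rho)$, and so, whp, $|L| = O(\rho)$
as well. Hence the size of our oracle remains $O(n)$.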
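The counting behind the $3\rho$ bound can be checked on a concrete degree sequence. The sketch below is an illustration only: it assumes $d(u) = \lceil \deg(u)/\lambda \rceil$ with $\lambda = m/n$, and the function names and the sample star graph are hypothetical.

```python
def virtual_copies(degrees):
    """Number of copies d(u) = ceil(deg(u) / lambda), with lambda = m / n.

    degrees: dict mapping each vertex to its degree (the graph has m >= 1 edges).
    Integer arithmetic is used so that the ceiling is exact.
    """
    n = len(degrees)
    m = sum(degrees.values()) // 2        # handshake lemma
    return {u: -(-deg * n // m) for u, deg in degrees.items()}


def expected_selected(degrees, rho):
    """Expected number of selected virtual nodes: (rho / n) * sum of d(u)."""
    copies = virtual_copies(degrees)
    return rho * sum(copies.values()) / len(degrees)


if __name__ == "__main__":
    # A star with one center of degree 9 and nine leaves (n = 10, m = 9).
    star = {0: 9, **{leaf: 1 for leaf in range(1, 10)}}
    print(virtual_copies(star)[0], expected_selected(star, rho=2.0))
```

Since the total number of copies is at most $2m/\lambda + n = 3n$, the expectation never exceeds $3\rho$, whatever the degree sequence.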

The rest of the construction algorithm for our distance oracle is identical to that of Section 4.
(The only change is the distribution of selecting landmarks.) The query algorithm is identical to
the query algorithm from Section 4. In particular, note that the virtual nodes have no effect on
the computation, i.e., the returned paths contain only original vertices.

Next we argue that the expected query time of the modified oracle is still at most $O(\frac{n}{\rho} \cdot \lambda)$ in
unweighted graphs, and $O(\frac{n}{\rho} \cdot \lambda \log n)$ in weighted ones. (As usual, we omit the additive term of
the number of edges of the returned path.) Specifically, we argue that the tests if $v \in Ball(u)$ and
if $u \in Ball(v)$ can be carried out within the above expected time.

Let $u = u_0, u_1, \ldots, u_{n-1}$ be all graph vertices ordered by a Dijkstra exploration originated
from $u$, and replace each vertex $u_i$ by its $d(u_i)$ copies $u_i^{(1)}, \ldots, u_i^{(d(u_i))}$. The copies appear in an
arbitrary order. Since each virtual node has probability $\frac{\rho}{n}$ to be selected independently of other
vertices, it follows by a previous argument that the expected number $N$ of virtual nodes that the algorithm
encounters before seeing a selected virtual node is $O(\frac{n}{\rho})$. (The algorithm actually explores only
original vertices. For the sake of this argument we imagine that when the algorithm reaches a
vertex $y$ it reaches its first copy $y^{(1)}$. Right after that it reaches the next copy $y^{(2)}$, etc., and
then reaches $y^{(d(y))}$. After "reaching" all these copies the algorithm continues to the next original
vertex.)

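Denote the original vertices explored by the algorithm $u_1, u_2, \ldots, u_i$, and let $u_i^{(h)}$ be a
selected copy of $u_i$. (We assume that all copies of $u_j$, for $j < i$, are not selected, and all copies
$u_i^{(h')}$, $h' < h$, are also not selected.) It follows that $N = \sum_{j=1}^{i-1} d(u_j) + h$. Hence

$$\mathbb{E}\Big(\sum_{j=1}^{i-1} d(u_j)\Big) \le \mathbb{E}(N) = O\Big(\frac{n}{\rho}\Big) .$$

Hence $\mathbb{E}\big(\sum_{j=1}^{i-1} \deg(u_j)/\lambda\big) = O(\frac{n}{\rho})$ as well. Thus

$$\mathbb{E}\Big(\sum_{j=1}^{i-1} \deg(u_j)\Big) = O\Big(\frac{n}{\rho} \cdot \lambda\Big) .$$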

Observe that the number of edges explored by the algorithm before reaching $u_i$ is at most
$\sum_{j=1}^{i-1} \deg(u_j)$. (The only edges incident on $u_i$ explored by the algorithm are edges $(u_j, u_i)$, for
$j < i$. These edges are accounted for in the above sum of degrees.) Hence the expected number of
edges explored by the algorithm is $O(\frac{n}{\rho}\lambda)$. Hence its expected running time is $O(\frac{n}{\rho}\lambda)$ (respectively,
$O(\frac{n}{\rho}\lambda \cdot \log n)$) in unweighted (resp., weighted) graphs. The bounds that hold with high probability
are higher by a factor of $O(\log n)$.


Corollary 5.1 Up to constant factors, the result of Theorem 4.1 holds for general undirected
unweighted m-edge n-vertex graphs with $m = \lambda n$. For undirected weighted graphs the expected
query time becomes $O(n^{\frac{1}{2}+\frac{1}{2k+2}} \cdot k \cdot \lambda \cdot \log n) = O(n^{\frac{1}{2}+\frac{1}{2k+2}} \cdot k \cdot \frac{m}{n} \cdot \log n)$, and the same bound
applies whp if one multiplies it by another $\log n$ factor.

Since $\mathbb{E}(|L|) = O(\rho)$, the construction time of the oracle is, up to constant factors, the same as in
Section 4.

This result provides a path-reporting analogue of the result of Agarwal et al. [6], which provides
stretch $O(k)$ and a smaller query time. Their oracle is not path-reporting. Our oracle is
path-reporting, but its query time is significantly higher, specifically it is $n^{\frac{1}{2}+\frac{1}{2k+2}} \cdot k \cdot \lambda$.

6 Oracles with Smaller Query Time 

In this section we devise two path-reporting oracles with improved query time. The first oracle has
size $O(m + n)$ (it stores the original graph), and query time $\lambda \cdot n^{\epsilon}$, for an arbitrarily small $\epsilon > 0$.
The stretch parameter of this oracle grows polynomially with $\epsilon^{-1}$. For the time being we will focus
on graphs of arboricity at most $\lambda$. The argument extends to general graphs with $m = \lambda n$ in the
same way as was described in Section 5. Our second oracle has size $O(n \log\log n)$ (independent of
the size of the original graph) and reports stretch-$O(\log^{\log_{4/3} 7} n)$ paths in $O(\log\log n)$ time. Both
draw on techniques used in sublinear additive spanner constructions of [21]. We will later build
upon the first oracle to construct additional oracles that work for dense graphs as well. Like the
second oracle, these later oracles will not have to store the input graph.

6.1 Construction of an Oracle with time $O(\lambda \cdot n^{\epsilon})$

In this section we describe the construction algorithm of our oracle. It will use a hierarchy of
landmarks' sets $L_1, L_2, \ldots, L_h$, for a positive integer parameter $h$ that will be determined later.
For each index $i \in [h]$, every vertex $v$ is selected into $L_i$ independently at random with probability
$p_i = \frac{\rho_i}{n}$, $\rho_1 > \rho_2 > \ldots > \rho_h$. The sequence $\rho_1, \rho_2, \ldots, \rho_h$ will be determined in the sequel. The
vertices of $L_i$ will be called the i-level landmarks, or shortly, the i-landmarks. For convenience of
notation we also denote $L_0 = V$.

For each vertex $v \in V$ and index $i \in [h]$, let $\ell_i(v)$ denote the closest i-landmark to $v$, where
ties are broken in an arbitrary consistent way. Denote $r_i(v) = d_G(v, \ell_i(v))$ the distance between
$v$ and its closest i-landmark $\ell_i(v)$. Following [29], for a real number $0 < c \le 1$, let $B_i^{c}(v) = \{u \mid
d_G(v, u) < c \cdot r_i(v)\}$ denote the ith c-fraction-ball of $v$. In our analysis $c$ will be set to either 1/3 or 1.
Specifically, $B_i^{1/3}(v)$ denotes the one-third-ball of $v$, and $Ball_i(v) = B_i^{1}(v) = \{u \mid d_G(v, u) < r_i(v)\}$
denotes the ith ball of $v$.

For each vertex $v \in V$ we keep a shortest path between $v$ and $\ell_1(v)$. (This is a forest of vertex-
disjoint SPTs rooted at 1-landmarks. For each 1-landmark $u'$, its SPT spans all vertices $v \in V$
which are closer to $u'$ than to any other 1-landmark.) Similarly, for each $i \in [h-1]$ and every
i-landmark $u$ we keep a shortest path between $u$ and its closest $(i+1)$st landmark $\ell_{i+1}(u)$.
Again, this entails storing a forest of vertex-disjoint SPTs rooted at $(i+1)$-landmarks, for
each index $i \in [h-1]$. Overall this part of the oracle requires $O(n \cdot h)$ space.

For the hth-level landmarks' set $L_h$ we build a DPPRO $\mathcal{L}_h$ described in Section 3. Given a pair
$u, v$ of h-landmarks this oracle returns a shortest path $\Pi(u, v)$ between them in time proportional
to the number of edges in this path, i.e., $O(|\Pi(u, v)|)$. The space requirement of the oracle $\mathcal{L}_h$ is
$O(n + |L_h|^4)$, and thus we will select $\rho_h$ to ensure that $|L_h|^4 = O(n)$, i.e., $\rho_h$ will be roughly $n^{1/4}$.
Denote also by $\mathcal{P}_h = \binom{L_h}{2}$ the set of all pairs of h-landmarks.

For each index $i \in [h-1]$, we also build a DPPRO $\mathcal{D}_i$ for the following set $\mathcal{P}_i$ of pairs of
i-landmarks. Each pair of i-landmarks $u, v$ such that either $v \in B_{i+1}^{1/3}(u)$ or $u \in B_{i+1}^{1/3}(v)$ is inserted
into $\mathcal{P}_i$.

Similarly to the DPPRO $\mathcal{L}_h$, given a pair $(u, v) \in \mathcal{P}_i$ for some $i \in [h-1]$, the oracle $\mathcal{D}_i$ returns
a shortest path $\Pi(u,v)$ between $u$ and $v$ in time $O(|\Pi(u,v)|)$. Our oracle also stores the graph $G$
itself. We will later show a variant of this oracle that does not store $G$ (Theorem 6.6). The size of
the oracle $\mathcal{D}_i$ is $O(n + |Branch_i|)$, where $Branch_i$ is the set of branching events for the set $\mathcal{P}_i$. Since
we aim at a linear size bound, we will ensure that $|Branch_i| = O(n)$, for every $i \in [h-1]$. We will
also construct a hash table $\mathcal{H}_i$ for $\mathcal{P}_i$ of size $O(|\mathcal{P}_i|)$ that supports membership queries to $\mathcal{P}_i$ in
$O(1)$ time per query. The resulting h-level oracle will be denoted $\Lambda_h$.

6.2 The Query Algorithm 

Next, we describe the query algorithm of our oracle $\Lambda_h$. The query algorithm is given a pair
$u = u^{(0)}, v = v^{(0)}$ of vertices. The algorithm starts with testing if $v \in Ball_1(u)$ and if $u \in Ball_1(v)$.
For this test the algorithm just conducts a Dijkstra search from $v$ until it discovers either $v^{(1)}$ or
$u$ (and, symmetrically, also conducts a search from $u$).

Observe that by Equation (1), the expected size of $Ball_1(u)$ and of $Ball_1(v)$ is $O(\frac{n}{\rho_1})$, and whp
both these sets have size $O(\frac{n}{\rho_1} \cdot \log n)$. Hence the running time of this step is, whp, $\tilde{O}(\frac{n}{\rho_1} \cdot \lambda)$.
(Specifically, it is $O(\frac{n}{\rho_1} \cdot \lambda \cdot \log n)$ in unweighted graphs, and $O(\frac{n}{\rho_1} \cdot \log n \cdot (\lambda + \log n))$ in weighted
ones. The expected running time of this step is smaller by a factor of $\log n$ than the above bound.)

If the algorithm discovers that $v \in Ball_1(u)$ or that $u \in Ball_1(v)$ then it has found the shortest
path between $u$ and $v$. In this case the algorithm returns this path. Otherwise it has found
$u^{(1)} = \ell_1(u^{(0)})$ and $v^{(1)} = \ell_1(v^{(0)})$.

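In general consider a situation when for some index $j$, $1 \le j \le h$, the algorithm has already
computed $u^{(j)}$ and $v^{(j)}$. In this case, inductively, the algorithm has already computed shortest
paths $\Pi(u^{(0)}, u^{(1)}), \Pi(u^{(1)}, u^{(2)}), \ldots, \Pi(u^{(j-1)}, u^{(j)})$ and $\Pi(v^{(0)}, v^{(1)}), \Pi(v^{(1)}, v^{(2)}), \ldots, \Pi(v^{(j-1)}, v^{(j)})$
between $u^{(0)}$ and $u^{(1)}$, ..., $u^{(j-1)}$ and $u^{(j)}$, and between $v^{(0)}$ and $v^{(1)}$, ..., $v^{(j-1)}$ and $v^{(j)}$,
respectively. (Note that the base case $j = 1$ has been just argued.)

For $j < h$, the query algorithm of our oracle $\Lambda_h$ then queries the hash table $\mathcal{H}_j$ whether the pair
$(u^{(j)}, v^{(j)}) \in \mathcal{P}_j$. If this is the case then the algorithm queries the oracle $\mathcal{D}_j$, which, in turn, returns
the shortest path $\Pi(u^{(j)}, v^{(j)})$ between $u^{(j)}$ and $v^{(j)}$ in time $O(|\Pi(u^{(j)}, v^{(j)})|)$. The algorithm then
reports the concatenated path

$$\Pi(u,v) = \Pi(u^{(0)}, u^{(1)}) \cdot \ldots \cdot \Pi(u^{(j-1)}, u^{(j)}) \cdot \Pi(u^{(j)}, v^{(j)}) \cdot \Pi(v^{(j)}, v^{(j-1)}) \cdot \ldots \cdot \Pi(v^{(1)}, v^{(0)}) .$$

Computing this concatenation requires $O(j) \le O(|\Pi(u,v)|)$ time.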

In the complementary case when $(u^{(j)}, v^{(j)}) \notin \mathcal{P}_j$, the algorithm fetches the prerecorded paths
$\Pi(u^{(j)}, u^{(j+1)})$ and $\Pi(v^{(j)}, v^{(j+1)})$, and invokes itself recursively on the pair $(u^{(j+1)}, v^{(j+1)})$. (Recall
that for each index $j$, $1 \le j \le h-1$, the algorithm stores a forest of vertex-disjoint SPTs rooted at
$(j+1)$-landmarks $L_{j+1}$. These SPTs enable us to compute the paths $\Pi(u^{(j)}, u^{(j+1)})$, $\Pi(v^{(j)}, v^{(j+1)})$
for all $j \in [h-1]$, in time proportional to the number of edges in these paths.)

Finally, if $j = h$ then we query the DPPRO $\mathcal{L}_h$ of the graph with the query $(u^{(h)}, v^{(h)})$.
(Note that it is not necessary to query if $(u^{(h)}, v^{(h)})$ is in the DPPRO $\mathcal{L}_h$, since, by construction, all
such pairs are there.) The query returns the shortest path between them in time $O(|\Pi(u^{(h)}, v^{(h)})|)$.

It follows that the overall running time of the query algorithm is dominated by the time required
to compute $u^{(1)}$ and $v^{(1)}$. Specifically, it is

$$\tilde{O}\Big(\frac{n}{\rho_1} \cdot \lambda\Big) + O\Big(\sum_{i=0}^{j-1} \big(|\Pi(u^{(i)}, u^{(i+1)})| + |\Pi(v^{(i)}, v^{(i+1)})|\big) + |\Pi(u^{(j)}, v^{(j)})|\Big),$$

where $1 \le j \le h$ is the smallest index such that $(u^{(j)}, v^{(j)}) \in \mathcal{P}_j$. (Recall that for $j = h$, $\mathcal{P}_h = \binom{L_h}{2}$,
i.e., all pairs of h-landmarks belong to $\mathcal{P}_h$.) Hence the overall query time is $\tilde{O}(\frac{n}{\rho_1} \cdot \lambda) + O(|\Pi(u,v)| + h)$,
where $\Pi(u,v)$ is the path that the algorithm ultimately returns.
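The level-by-level structure of the query can be sketched as follows. This is only an illustration of the control flow: the data layout (`up`, `pairs`), the callback `ball_test`, and both function names are hypothetical stand-ins for the SPT forests, the hash tables with their DPPROs, and the Dijkstra-based ball test, and the treatment of the case $u^{(j)} = v^{(j)}$ is a simplification made here.

```python
def lookup(table, a, b):
    """Return a stored path from a to b, trying both orientations of the pair."""
    if a == b:
        return [a]
    if (a, b) in table:
        return table[(a, b)]
    if (b, a) in table:
        return table[(b, a)][::-1]
    return None


def query(u, v, h, ball_test, up, pairs):
    """Return a u-v path as a list of vertices.

    ball_test(u, v): exact path if v is in Ball_1(u) or u is in Ball_1(v), else None.
    up[j][x]: (closest (j+1)-landmark of x, stored tree path from x to it), 0 <= j < h.
    pairs[j]: dict (a, b) -> shortest path, standing in for the level-j hash table
              and DPPRO; pairs[h] is assumed to cover every pair of h-landmarks.
    """
    exact = ball_test(u, v)
    if exact is not None:
        return exact
    left, right = [u], [v]            # u^(0)..u^(j) and v^(0)..v^(j)
    a, b = u, v
    for j in range(1, h + 1):
        a, path_a = up[j - 1][a]      # climb to the level-j landmarks
        b, path_b = up[j - 1][b]
        left += path_a[1:]
        right += path_b[1:]
        middle = lookup(pairs[j], a, b)
        if middle is not None:
            return left + middle[1:] + right[::-1][1:]
    raise KeyError("pairs[h] must contain all pairs of h-landmarks")
```

The loop stops at the smallest level $j$ whose table contains the current pair, which mirrors the role of the index $j$ in the running-time bound above.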

Remark: If for each index $0 \le j \le h-1$ at least one of the subpaths $\Pi(u^{(j)}, u^{(j+1)})$, $\Pi(v^{(j)}, v^{(j+1)})$
is not empty then $h \le |\Pi(u,v)|$, and the resulting query time is $\tilde{O}(\frac{n}{\rho_1}\lambda) + O(|\Pi(u,v)|)$. One
can artificially guarantee that all these subpaths will not be empty, i.e., that $u^{(j)} \ne u^{(j+1)}$ and
$v^{(j)} \ne v^{(j+1)}$, for every $j$. To do this one can modify the construction slightly so that the set of
i-landmarks and the set of j-landmarks will be disjoint for all $i \ne j$. Under this modification
of the algorithm the query time is $\tilde{O}(\frac{n}{\rho_1} \cdot \lambda) + O(|\Pi(u,v)|)$, while the stretch guarantee of the
oracle (which will be analyzed in Section 6.3) stays the same. This modification can make oracle's
performance only worse than it is without this modification, but the bounds on the query time of
the modified oracle in terms of the number of edges in the returned path become somewhat nicer.
(See Theorem 6.6.)

6.3 The Stretch Analysis 

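Recall that in the case that $v \in Ball_1(u)$ or $u \in Ball_1(v)$ our algorithm returns the exact shortest
path between $u = u^{(0)}$ and $v = v^{(0)}$. Hence we next consider the situation when $v \notin Ball_1(u)$
and $u \notin Ball_1(v)$. For brevity let $d = d^{(0)} = d_G(u, v)$. At this point the algorithm also has
already computed $u^{(1)}$ and $v^{(1)}$ along with the shortest paths $\Pi(u^{(0)}, u^{(1)})$ and $\Pi(v^{(0)}, v^{(1)})$
between $u^{(0)}$ and $u^{(1)}$ and between $v^{(0)}$ and $v^{(1)}$, respectively. Observe that in this scenario we have
$d_G(u^{(0)}, u^{(1)}), d_G(v^{(0)}, v^{(1)}) \le d$, and so

$$d_G(u^{(1)}, v^{(1)}) \le d_G(u^{(1)}, u^{(0)}) + d_G(u^{(0)}, v^{(0)}) + d_G(v^{(0)}, v^{(1)}) \le 3 \cdot d .$$

Hence if $(u^{(1)}, v^{(1)}) \in \mathcal{P}_1$ then the path $\Pi(u^{(0)}, u^{(1)}) \cdot \Pi(u^{(1)}, v^{(1)}) \cdot \Pi(v^{(1)}, v^{(0)})$ returned by the
algorithm is a 5-approximate path between $u$ and $v$. Indeed, its length is at most

$$d_G(u^{(0)}, u^{(1)}) + d_G(u^{(1)}, v^{(1)}) + d_G(v^{(1)}, v^{(0)}) \le d + 3 \cdot d + d = 5 \cdot d .$$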

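More generally, suppose the query algorithm reached the j-level landmarks $u^{(j)}, v^{(j)}$ for some
$j$, $1 \le j \le h-1$, and suppose that $(u^{(j)}, v^{(j)}) \notin \mathcal{P}_j$. This means that $v^{(j)} \notin B_{j+1}^{1/3}(u^{(j)})$ and
$u^{(j)} \notin B_{j+1}^{1/3}(v^{(j)})$. By definition of the one-third-ball it follows that

$$d_G(u^{(j)}, v^{(j)}) \ge \frac{1}{3} \cdot d_G(u^{(j)}, u^{(j+1)}) = \frac{1}{3} \cdot r_{j+1}(u^{(j)}),$$

and

$$d_G(u^{(j)}, v^{(j)}) \ge \frac{1}{3} \cdot r_{j+1}(v^{(j)}) = \frac{1}{3} \cdot d_G(v^{(j)}, v^{(j+1)}),$$

where $u^{(j+1)}$ (respectively, $v^{(j+1)}$) is the $(j+1)$-landmark closest to $u^{(j)}$ (resp., $v^{(j)}$).
Hence

$$d_G(u^{(j+1)}, v^{(j+1)}) \le d_G(u^{(j+1)}, u^{(j)}) + d_G(u^{(j)}, v^{(j)}) + d_G(v^{(j)}, v^{(j+1)}) \le 7 \cdot d_G(u^{(j)}, v^{(j)}) .$$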

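Denote by $p$, $1 \le p \le h$, the index for which the algorithm discovers that $(u^{(p)}, v^{(p)}) \in \mathcal{P}_p$. (Since
$(u^{(h)}, v^{(h)}) \in \mathcal{P}_h$ for every pair of h-landmarks, it follows that the index $p$ is well-defined.)

We have seen that $d_G(u^{(1)}, v^{(1)}) \le 3d$, and for every index $j$, $1 \le j \le p-1$, $d_G(u^{(j+1)}, v^{(j+1)}) \le
7 \cdot d_G(u^{(j)}, v^{(j)})$. Hence for every $j$, $1 \le j \le p$, it holds that $d_G(u^{(j)}, v^{(j)}) \le 3 \cdot 7^{j-1} \cdot d$. Denote
$d^{(j)} = 3 \cdot 7^{j-1} \cdot d$, for $0 < j \le p$. Also, $d_G(u^{(0)}, u^{(1)}), d_G(v^{(0)}, v^{(1)}) \le d = d^{(0)}$, and for every index
$j$, $1 \le j \le p-1$,

$$d_G(u^{(j)}, u^{(j+1)}), d_G(v^{(j)}, v^{(j+1)}) \le 3 \cdot d_G(u^{(j)}, v^{(j)}) \le 3 \cdot d^{(j)} = 3^2 \cdot 7^{j-1} \cdot d .$$

Hence the length of the path $\Pi(u,v)$ returned by the algorithm is at most

$$\Big(d^{(0)} + \sum_{j=1}^{p-1} 3 \cdot d^{(j)}\Big) + d^{(p)} + \Big(\sum_{j=1}^{p-1} 3 \cdot d^{(j)} + d^{(0)}\Big) = d \cdot \Big(2 \cdot \Big(1 + \sum_{j=1}^{p-1} 3 \cdot 3 \cdot 7^{j-1}\Big) + 3 \cdot 7^{p-1}\Big) = d \cdot (6 \cdot 7^{p-1} - 1) .$$

Since $p \le h$ we conclude that the oracle has stretch at most $6 \cdot 7^{h-1} - 1$.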
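The closed form of the stretch bound can be verified by summing the per-level bounds directly. The following sketch measures all lengths in units of $d = d_G(u,v)$; the function names are hypothetical.

```python
def level_distance(j: int) -> int:
    """d^{(j)} / d: equals 1 for j = 0 and 3 * 7^(j-1) for j >= 1."""
    return 1 if j == 0 else 3 * 7 ** (j - 1)


def path_length_bound(p: int) -> int:
    """Bound on the returned path length when the query stops at level p."""
    # Each side climbs u^(0) -> u^(1) -> ... -> u^(p); step j costs at most 3 * d^{(j)}.
    climb = level_distance(0) + sum(3 * level_distance(j) for j in range(1, p))
    return climb + level_distance(p) + climb


if __name__ == "__main__":
    for p in range(1, 6):
        print(p, path_length_bound(p), 6 * 7 ** (p - 1) - 1)
```

For $p = 1$ this gives the 5-approximation discussed above, and for $p = 2$ it gives 41.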


6.4 The Size of the Oracle 

For each index $i \in [h]$, our oracle stores a forest of (vertex-disjoint) SPTs rooted at i-landmarks.
Each of these forests requires $O(n)$ space, i.e., together these $h$ forests require $O(n \cdot h)$ space.

We next set the values $\rho_1 > \rho_2 > \ldots > \rho_h$ so that each of the auxiliary oracles $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_{h-1}, \mathcal{L}_h$
requires $O(n)$ space. Each of the hash tables $\mathcal{H}_1, \mathcal{H}_2, \ldots, \mathcal{H}_h$ associated with these oracles requires
less space than its respective oracle. Recall that the parameter $\rho_1$ also determines the query time.
(It is $\tilde{O}(\frac{n}{\rho_1}\lambda) + O(|\Pi|)$, where $\Pi$ is the path returned by the algorithm. In the sequel we will often
skip the additive term of $O(|\Pi|)$ when stating the query time.)

For each $i \in [h]$ we write $\rho_i = n^{\alpha_i}$, where $\alpha_i = 1 - (3/4)^{h-i+1}$. Observe that $\alpha_h = 1/4$, i.e.,
$\rho_h = n^{1/4}$.

Hence $\mathbb{E}(|L_h|) = \rho_h = n^{1/4}$, and by Chernoff's bound, whp, $|L_h| = O(n^{1/4})$. (Recall that $|L_h|$ is
a Binomial random variable.) Hence the DPPRO $\mathcal{L}_h$ for $\mathcal{P}_h = \binom{L_h}{2}$ requires space $O(|L_h|^4 + n) =
O(n)$, whp.

Next we analyze the space requirements of the oracles $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_{h-1}$. Fix an index $i \in [h-1]$,
and recall that the space requirement of the DPPRO $\mathcal{D}_i$ is $O(n + |Branch_i| + |\mathcal{P}_i|)$, where $Branch_i$
is the set of branching events for the set $\mathcal{P}_i$ of pairs of vertices. Next we argue that (whp)
$|Branch_i| = O(n)$. Recall that the set $\mathcal{P}_i$ contains all pairs of i-landmarks $(u^{(i)}, v^{(i)})$ such that
either $v^{(i)} \in B_{i+1}^{1/3}(u^{(i)})$ or $u^{(i)} \in B_{i+1}^{1/3}(v^{(i)})$.

The following two lemmas from [22] are the key to the analysis of the oracle's size. The first
says that with our definition of $\mathcal{P}_{i+1}$ all branching events are confined to $(i+1)$st level balls. The
second bounds the expected number of branching events in terms of the sampling probabilities.
For completeness, the proofs of these lemmas are provided in the Appendix.

Lemma 6.1 Suppose that $v \in B_{i+1}^{1/3}(u)$. Then if $(x,y) \in \mathcal{P}_{i+1}$ and there is a branching event
between the pairs $(u,v)$ and $(x,y)$ then necessarily $x, y \in Ball_{i+1}(u)$.


Lemma 

m = o 


6.2 Whp, |Branchi| = O 




and lE(|Branchj|) 



Moreover, whp 


Observe that with our choice of $\rho_i$ ($\rho_i = n^{\alpha_i}$, $\alpha_i = 1 - (3/4)^{h-i+1}$ for every $i \in [h]$), it holds
for every $i \in [h-1]$ that $O(\rho_i^4/\rho_{i+1}^3) = O(n)$, and $O(\rho_i^2/\rho_{i+1}) = O(n^{1 - \frac{1}{2}(3/4)^{h-i}})$.
Hence by Lemma 6.2, for each $i \in [h-1]$, the oracle $\mathcal{D}_i$ requires expected space
$O(n + |Branch_i| + |\mathcal{P}_i|) = O(n)$. Thus the overall expected space required by our h-level
oracle $\Lambda_h$ (in addition to the space required to store the original graph $G$) is $O(n \cdot h)$. Recall that
the query time is (whp) $\tilde{O}((n/\rho_1)\lambda) = \tilde{O}(n^{(3/4)^h} \cdot \lambda)$.
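The exponent arithmetic behind the choice $\alpha_i = 1 - (3/4)^{h-i+1}$ can be checked exactly with rational numbers. The sketch below is an illustration; the function name and the sample value of $h$ are assumptions.

```python
from fractions import Fraction


def alphas(h: int):
    """alpha_i = 1 - (3/4)^(h-i+1) for i = 1..h, as exact fractions."""
    return [1 - Fraction(3, 4) ** (h - i + 1) for i in range(1, h + 1)]


if __name__ == "__main__":
    h = 5                                   # hypothetical number of levels
    a = alphas(h)
    for i in range(h - 1):
        # Exponent of n in rho_i^4 / rho_{i+1}^3 and in rho_i^2 / rho_{i+1}.
        print(i + 1, 4 * a[i] - 3 * a[i + 1], 2 * a[i] - a[i + 1])
    print("alpha_h =", a[-1], " query exponent =", 1 - a[0])
```

The first printed exponent is exactly 1 at every level, the second is strictly below 1, and $1 - \alpha_1 = (3/4)^h$ is the exponent of $n$ in the query time.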

The argument described in Section 5 enables us to extend these results to general m-edge
n-vertex graphs.


Theorem 6.3 For any parameter $h = 1, 2, \ldots$ and any n-vertex undirected possibly weighted graph
$G$ with arboricity $\lambda$, the path-reporting distance oracle $\Lambda_h$ uses expected space $O(n \cdot h)$, in addition
to the space required to store $G$. Its stretch is $(6 \cdot 7^{h-1} - 1)$, and its query time is (whp) $\tilde{O}(n^{(3/4)^h} \lambda)$.

The same result applies for any m-edge n-vertex graph with $\lambda = m/n$.

Specifically, in unweighted graphs with arboricity $\lambda$ the query time is $O((n/\rho_1) \cdot \lambda \cdot \log n) =
O(n^{(3/4)^h} \cdot \lambda \cdot \log n)$, while in weighted graphs it is $O(n^{(3/4)^h} \cdot (\lambda + \log n) \log n)$. In unweighted
m-edge n-vertex graphs the query time is $O(n^{(3/4)^h} \cdot \frac{m}{n} \cdot \log n)$, while in m-edge n-vertex weighted
graphs it is $O(n^{(3/4)^h} \cdot \frac{m}{n} \cdot \log^2 n)$.

By introducing a parameter $t = (4/3)^h$ we get query time $\tilde{O}(n^{1/t}\lambda)$, space $O(n \cdot \log t)$, and
stretch at most $t^{\log_{4/3} 7}$. (The exponent is $\approx 6.76$.)
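The tradeoff between the number of levels, the query-time exponent and the stretch can be tabulated numerically. This sketch only evaluates the formulas stated above; the function name and the range of $h$ are hypothetical.

```python
import math

# log_{4/3} 7, the exponent in the stretch bound t^{log_{4/3} 7}.
EXPONENT = math.log(7) / math.log(4 / 3)


def tradeoff(h: int):
    """Return (t, query exponent 1/t, stretch 6*7^(h-1)-1, bound t^EXPONENT)."""
    t = (4 / 3) ** h
    return t, 1 / t, 6 * 7 ** (h - 1) - 1, t ** EXPONENT


if __name__ == "__main__":
    print(round(EXPONENT, 2))
    for h in range(1, 7):
        t, q, stretch, bound = tradeoff(h)
        print(h, round(t, 3), round(q, 3), stretch, round(bound, 1))
```

The last column equals $7^h$ up to rounding, which dominates the exact stretch $6 \cdot 7^{h-1} - 1$ for every $h$.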

Corollary 6.4 For any constant $t$ of the form $t = (4/3)^h$ (for a positive integer $h$) and an n-
vertex graph $G$ with arboricity $\lambda$, our path-reporting distance oracle $\Lambda_h$ uses expected space $O(n)$
(in addition to the space needed to store $G$). It provides stretch at most $t^{\log_{4/3} 7}$, and its query time
is (whp) $\tilde{O}(n^{1/t}\lambda)$. (For a non-constant $t$ the space requirement becomes $O(n \cdot \log t)$.) The same
result applies for any m-edge n-vertex graph with $\lambda = m/n$.

Yet better bounds can be obtained if one is interested in small expected query time. The
expected query time is dominated by the time required to test if $v \in Ball_1(u)$ and if $u \in Ball_1(v)$.
For unweighted graphs these tests require $O(\frac{n}{\rho_1}\lambda) = O(n^{(3/4)^h} \lambda)$ expected time.

Corollary 6.5 For any $t$ of the form $t = (4/3)^h$, for a positive integer $h$, and an n-vertex m-edge
graph $G$, our path-reporting oracle $\Lambda_h$ uses expected $O(n \cdot h)$ space in addition to the space required to
store $G$. It provides stretch at most $t^{\log_{4/3} 7}$, and its expected query time is $O(n^{1/t} \cdot (m/n) + \log t)$ for
unweighted graphs. In the case of weighted graphs the expected query time is $O(n^{1/t}(m/n) \cdot \log n)$.

Consider now the oracle $\Lambda_h$ for a superconstant number of levels $h = \log_{4/3}(\log n + 1)$. Then
$\rho_1 = (2n)^{\alpha_1} = n$. In other words, all vertices $V$ of $G$ are now defined as the first level landmarks
(1-landmarks), i.e., $L_1 = V$. (For levels $i = 2, 3, \ldots, h$, landmarks $L_i$ are still selected at random
from $V$ with probability $\rho_i/n < 1$, independently. For level 1 this probability is 1.) Recall that our
oracle starts with testing if $v \in Ball_1(u)$ and if $u \in Ball_1(v)$. Now both these balls are empty sets,
because all vertices belong to $L_1$. Thus with this setting of parameters the oracle $\Lambda_h$ no longer
needs to conduct this time-consuming test. Rather it proceeds directly to querying the oracle $\mathcal{D}_1$.
Remarkably, this variant of our oracle does not require storing the graph $G$. (Recall that the graph
was only used by the query algorithm for testing if $v \in Ball_1(u)$ and if $u \in Ball_1(v)$.) The query
time of the new oracle is now dominated by the $h$ queries to the oracles $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_{h-1}, \mathcal{L}_h$, i.e.,
$O(h) = O(\log\log n)$. Recall that, by the remark at the end of Section 6.2, one can always make
our oracle return paths with at least $h$ edges, and thus the $O(h) = O(\log\log n)$ additive term
in the query time can be swallowed by $O(|\Pi|)$, where $\Pi$ is the path that our oracle returns.

Denote by $\Lambda$ the oracle which was just described. The stretch of $\Lambda$ is (by Theorem 6.3)
$6 \cdot 7^{h-1} - 1 = O(\log^{\log_{4/3} 7} n)$.
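The two identities used for the superconstant setting, $(2n)^{\alpha_1} = n$ and $7^h = (\log n + 1)^{\log_{4/3} 7}$, can be checked numerically. The sketch below assumes base-2 logarithms and $n$ a power of two, and it does not round $h$ to an integer; the function names are hypothetical.

```python
import math


def top_level_size(log_n: int) -> float:
    """rho_1 = (2n)^{alpha_1} for n = 2^log_n, where (3/4)^h = 1 / (log_n + 1)."""
    alpha_1 = 1 - 1 / (log_n + 1)
    return (2 * 2 ** log_n) ** alpha_1


def stretch_bound(log_n: int) -> float:
    """7^h for h = log_{4/3}(log_n + 1)."""
    h = math.log(log_n + 1) / math.log(4 / 3)
    return 7 ** h


if __name__ == "__main__":
    log_n = 20
    print(top_level_size(log_n), 2 ** log_n)
    print(stretch_bound(log_n), (log_n + 1) ** (math.log(7) / math.log(4 / 3)))
```

Both printed pairs agree up to floating-point error, matching $L_1 = V$ and the polylogarithmic stretch.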

Theorem 6.6 The oracle $\Lambda$ is a path-reporting oracle with expected space $O(n \log\log n)$, where
$n$ is the number of vertices of its input undirected weighted graph $G$. Its stretch is $O(\log^{\log_{4/3} 7} n)$
and its query time is $O(\log\log n)$. (It can be made $O(1)$, but the paths returned by the oracle will
then contain $\Omega(\log\log n)$ edges.)

Note that by Markov's inequality, Theorem 6.6 implies that one can produce a path-reporting
oracle with space $O(n \log\log n)$, query time $O(\log\log n)$ and polylogarithmic stretch by just
repeating the above oracle-constructing algorithm $O(\log n)$ times. Whp, in one of the executions
the oracle's space will be $O(n \log\log n)$. Similarly, by the same Markov's argument, Corollary 6.4
implies that whp one can have the space of the oracle $\Lambda_h$ bounded by $O(n)$ (in addition to the
space required to store the input graph).

Next we analyze the construction time of our oracle. The $h$ forests rooted at landmarks can
be constructed in $O(m \cdot h)$ time. We also spend $O(m \cdot n) = O(n^2 \lambda)$ time to compute all-pairs-
shortest-paths (henceforth, APSP). Then for each ball $B_{i+1}^{1/3}(u)$, $u \in L_i$, we store all i-landmarks
that belong to it. They can be fetched from the APSP structure in $O(1)$ time per i-landmark.
The expected size of this data structure is $O(|\mathcal{P}_i|) = O(\rho_i^2/\rho_{i+1}) = O(n)$. Then we produce all
possible quadruples $u, v, x, y$ with $v, x, y \in Ball_{i+1}(u) \cap L_i$, $u \in L_i$. By the proof of Lemma 6.2
there are expected $O(\rho_i^4/\rho_{i+1}^3) = O(n)$ such quadruples. For each of these quadruples we check if
the involved shortest paths intersect, and compute the corresponding branching events. Since
the length of each such path is whp $O(\frac{n}{\rho_{i+1}} \cdot \log n)$, it follows that the entire computation can be
carried out in $\tilde{O}(\frac{n^2}{\rho_{i+1}})$ expected time. Recall that $\rho_{i+1} \ge \rho_h = n^{1/4}$, and thus this running time is
$\tilde{O}(n^{7/4})$. In $O(n \cdot |\mathcal{P}_h|^2) = O(n^2)$ additional time we construct the DPPRO $\mathcal{L}_h$ for the set of all
pairs of h-landmarks. The total expected construction time is therefore dominated by the APSP
computation, i.e., it is $O(m \cdot n)$.

6.5 Spanner-Based Oracles 

While the query time of our oracle $\Lambda$ is close to optimal (there is an additive slack of $O(\log\log n)$),
its space requirement $O(n \log\log n)$ is slightly suboptimal, and also its stretch requirement is
$O(\log^{\log_{4/3} 7} n)$, instead of the desired $O(\log n)$. Next we argue that one can get an optimal space
$O(n)$ and optimal stretch $O(\log n)$, at the expense of increasing the query time to $O(n^{\epsilon})$, for an
arbitrarily small constant $\epsilon > 0$.

Given an n-vertex weighted graph $G = (V, E, \omega)$ we start with constructing an $O(\log n)$-
spanner $G' = (V, H, \omega)$ of $G$ with $O(n)$ edges. (See [7]; a faster algorithm was given in [31]. For
unweighted graphs a linear-time construction can be found in [27], and a linear-time construction
with optimal stretch-space tradeoff can be found in [20].) Then we build the oracle $\Lambda_h$ for the
spanner $G'$. The space required by the oracle is (by Corollary 6.4) $O(n)$, plus the space required
to store the spanner $G'$, i.e., also $O(n)$. Hence the total space required for this spanner-based
oracle is $O(n)$. Its stretch is the product of the stretch of the oracle, i.e., at most $t^{\log_{4/3} 7}$ with
$t = (4/3)^h$ for an integer $h$, and the stretch of the spanner, i.e., $O(\log n)$. Hence the oracle's stretch
is $O(t^{\log_{4/3} 7} \cdot \log n)$. The oracle reports paths in $G' = (V, H)$, but since $H \subseteq E$, these paths belong
to $G$ as well. Observe also that the query time of the spanner-based oracle is $\tilde{O}(n^{1/t} \cdot \frac{m'}{n})$, where
$m' = |H|$ is the number of edges in the spanner. Since $m' = O(n)$, it follows that the query time
is, whp, $\tilde{O}(n^{1/t})$. We remark also that the spanners produced by these algorithms have constant arboricity,
and thus one does not really need the reduction described in Section 5 for this result.


Theorem 6.7 For any constant $\epsilon > 0$, the oracle obtained by invoking the oracle $\Lambda_h$ with $h =
\lceil \log_{4/3} \epsilon^{-1} \rceil$ from Corollary 6.4 on a linear-size $O(\log n)$-spanner is a path-reporting oracle with
space $O(n)$, stretch $O(\log n)$, and query time $O(n^{\epsilon})$.

Generally, we can use an 0{k)-spanner, < k < logn with edges. As a 

result we obtain a path-reporting distance oracle with space stretch 0{k) and guery 

time 


Observe that Theorem 6.7 exhibits an optimal (up to constant factors) tradeoff between the
stretch and the oracle size in the range < logn. The only known oracle that exhibits 

this tradeoff is due to Mendel and Naor [23]. However, the oracle of [23] is not path-reporting, 
while our oracle is. 

The construction time of this oracle consists of the time required to build the $O(\log n)$-spanner
(which is $O(n^2)$ [31]) and the construction time of the oracle $\Lambda_h$ in $G'$ (which is also $O(n^2)$, because
$G'$ has $O(n)$ edges). Hence its overall construction time is $O(n^2)$.

In the context of unweighted graphs the same idea of invoking our oracle from Corollary 6.4
on a spanner can be used in conjunction with $(1+\epsilon, \beta)$-spanners. Given an unweighted n-vertex
graph $G = (V, E)$, let $G' = (V, H)$ be its $(1+\delta, \beta)$-spanner, $\beta = \beta(\delta, k)$, with
$|H| = O(\beta \cdot n^{1+1/k})$ edges, for a pair of parameters $\delta > 0$, $k = 1, 2, \ldots$. (Such a construction
was devised in [18].) For the sake of the following application one can set $\delta = 1$. Invoke the
distance oracle from Corollary 6.4 with a parameter $t$ on top of this spanner. We obtain a
path-reporting distance oracle with space $O(\beta n^{1+1/k})$ (whp). Its stretch is

(3{t, k) = 7 k)) = t*°§ 4/3 7 . ^o(iogiogfc)^ q^gj-y time is 0(n^/*+^/^), whp. As long 

1 

as t = o(fc‘°*^4/3'^ the multiplicative stretch is o{k), the additive stretch is still (3{k) = 

while the space is In particular, one can have query time n V J ^ for an 

arbitrarily small constant rj > 0, stretch {o{k),k^^^°Cogk)f space 0 {k^^^°Cogk).^i+i/k'^^ 

Another variant of this construction has a higher query time $O(n^{\epsilon})$, for some arbitrarily small
constant $\epsilon > 0$, but its multiplicative stretch is $O(1)$. We just set $t$ to be a large fixed constant
and consider $k \ge t^{\log_{4/3} 7}$. Then the query time is $O(n^{\epsilon})$ whp ($\epsilon = t^{-1}$), stretch is $(O(1), \mathrm{poly}(1/\epsilon) \cdot
k^{O(\log\log k)})$, and space $O(\beta \cdot n^{1+1/k})$.

Theorem 6.8 For any unweighted undirected n-vertex graph $G$, any arbitrarily small constant
$\epsilon > 0$ and any parameter $k = 1, 2, \ldots$, our path-reporting distance oracle has query time $O(n^{\epsilon})$
(whp), stretch $(O(1), \beta(k))$ and space $O(\beta(k) \cdot n^{1+1/k})$ (whp), where $\beta(k) = k^{O(\log\log k)}$. Another

variant of this oracle has guery time n V / whp, for an arbitrarily small constant rj > 0, 

stretch (o(fc), and space whp. 

To our knowledge these are the first distance oracles whose tradeoff between multiplicative
stretch and space is better than the classical tradeoff, i.e., $2k - 1$ versus $O(n^{1+1/k})$. Naturally,
we pay by having an additive stretch. By lower bounds from [33], an additive stretch of $\Omega(k)$ is
inevitable for such distance oracles.

One can also use a $(5+\epsilon, k^{O(1)})$-spanner with $O(n^{1+1/k})$ edges from [29] instead of the
$(1+\epsilon, (\frac{\log k}{\epsilon})^{O(\log k)})$-spanner with $(\frac{\log k}{\epsilon})^{O(\log k)} n^{1+1/k}$ edges from [18] for our distance oracle. As a result
the oracle's space bound decreases to $O(n^{1+1/k})$, its additive stretch becomes polynomial in $k$,
but the multiplicative stretch grows by a factor of $5+\epsilon$. In general, any construction of $(\alpha, \beta)$-
spanners with size $O(S \cdot n)$ can be plugged in our oracle. The resulting oracle will have stretch
$(t^{\log_{4/3} 7} \cdot \alpha, t^{\log_{4/3} 7} \cdot \beta)$, size $O(Sn + n \cdot \log t)$, and query time $O(S \cdot n^{1/t})$.

The construction time of this oracle is the time needed to construct the (1 + e,/3)-spanner G', 
plus the construction of on G'. The construction time of [18] is xhe construction 

time of the oracle on $G'$ is $O(m' \cdot n')$, where $m' = O(\beta \cdot n^{1+1/k})$ is the number of edges in $G'$, and $n' = n$ is the number of vertices in $G'$. Hence the overall construction time in this case is $O(\beta(k) \cdot n^{2+1/k}) = k^{O(\log\log k)} \cdot n^{2+1/k}$.

7 Lower Bounds 

In this section we argue that one cannot expect to obtain distance labeling or routing schemes (see Section 2 for their definitions) with properties analogous to those of our distance oracles (given by Theorem 6.8 and Corollary 6.5). We also employ lower bounds of Sommer et al. [32] to show that a distance oracle with stretch $(O(1), \beta(k))$ and space $O(\beta(k) \cdot n^{1+1/k})$ for unweighted $n$-vertex graphs (like the distance oracle given by Theorem 6.8) must have query time $\Omega(k)$.

7.1 Distance Labeling and Routing 

We start with discussing distance labeling schemes. Suppose for contradiction that there were a distance labeling scheme $\mathcal{D}$ for unweighted $n$-vertex graphs with maximum label size $O(n^{\frac{1}{t+4}})$ and stretch $(t, t \cdot \beta(k))$, for some fixed function $\beta(\cdot)$, and any parameter $k$. Consider an infinite family of $n$-vertex unweighted graphs $G_n = (V,E_n)$ with girth at least $t+2$ and $|E_n| = \Theta(n^{1+\frac{1}{t+2}})$. (Such a family can be easily constructed by the probabilistic method; see, e.g., na, Theorem 3.7(a). Denser extremal graphs can be found in [22], [2T].) There are $2^{|E_n|}$ different subgraphs of each $G_n$. To achieve stretch $t$, one would need $2^{\Omega(n^{1+\frac{1}{t+2}})}$ distinct encodings for these graphs, i.e., the total label size for this task is $\Omega(n^{1+\frac{1}{t+2}})$, and the maximum individual label size is $\Omega(n^{\frac{1}{t+2}})$. (See, e.g., [35], Chapter 5, for this lower bound.)

Replace every edge of $G = G_n$ by a path of length $10t \cdot \beta(k)$, consisting of new vertices. The new graph $G'_n$ has $N = O(n^{1+\frac{1}{t+2}} \cdot t \cdot \beta(k))$ vertices. Invoke the distance labeling scheme $\mathcal{D}$ on $G'_n$. For a pair of original vertices $u,v$ (vertices of $G_n$), the distance between them in $G'_n$ is $d'(u,v) = 10t\beta(k) \cdot d_G(u,v)$. Given their labels $\varphi(u)$ and $\varphi(v)$, the labeling scheme $\mathcal{D}$ provides us with an estimate $\delta(\varphi(u),\varphi(v))$ of the distance between them in $G'_n$, which satisfies:

$$\delta(\varphi(u),\varphi(v)) \le t \cdot d'(u,v) + t \cdot \beta(k) = (10t\beta(k) \cdot d_G(u,v)) \cdot t + t \cdot \beta(k).$$

On the other hand, a path of length $d_G(u,v) \cdot t + 1$ in $G$ between $u$ and $v$ translates into a path of length

$$10t \cdot \beta(k) \cdot (d_G(u,v) \cdot t + 1) = 10t^2\beta(k) \cdot d_G(u,v) + 10t\beta(k)$$

between them in $G'_n$. Since $t \cdot \beta(k) < 10t\beta(k)$, the estimate is strictly smaller than this length. Hence the estimate provided by $\mathcal{D}$ corresponds to a path between $u$ and $v$ of length at most $d_G(u,v) \cdot t$ in $G_n$, i.e., via $\mathcal{D}$ we obtain a $t$-approximate distance labeling scheme for $G_n$.
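The reduction above can be sketched in a few lines. The code below subdivides every edge into a path of a given length, checks by BFS that distances between original vertices scale by that length, and rounds an estimate with stretch $(t, t\beta)$ in the subdivided graph down to a $t$-approximate estimate in the original graph. The helper names (`subdivide`, `bfs_dist`, `recover_estimate`) and the integer vertex ids are assumptions made for illustration; the labeling scheme itself is not modeled, only the arithmetic of the argument.

```python
from collections import deque


def subdivide(edges, n, length):
    """Replace every edge by a path of `length` edges; new vertices get ids >= n."""
    adj = {v: [] for v in range(n)}
    next_id = n
    for u, v in edges:
        prev = u
        for _ in range(length - 1):  # length - 1 new internal vertices per edge
            adj[next_id] = [prev]
            adj[prev].append(next_id)
            prev = next_id
            next_id += 1
        adj[prev].append(v)
        adj[v].append(prev)
    return adj


def bfs_dist(adj, source):
    """Unweighted single-source distances."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def recover_estimate(delta, t, beta):
    """Map an estimate in the subdivided graph to an estimate in the original graph."""
    return delta // (10 * t * beta)
```

Any `delta` between $10t\beta \cdot d_G$ and $10t^2\beta \cdot d_G + t\beta$ is mapped by `recover_estimate` to a value between $d_G$ and $t \cdot d_G$, which is exactly the rounding step used in the argument.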

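The maximum label size used by $\mathcal{D}$ is

$$O(N^{\frac{1}{t+4}}) = O\left((n^{1+\frac{1}{t+2}} \cdot t \cdot \beta(k))^{\frac{1}{t+4}}\right) = O\left(n^{\frac{t+3}{(t+2)(t+4)}} \cdot (t \cdot \beta(k))^{\frac{1}{t+4}}\right).$$

However, by the above argument, this label size must be $\Omega(n^{\frac{1}{t+2}})$. Note that

$$n^{\frac{t+3}{(t+2)(t+4)}} \cdot (t \cdot \beta(k))^{\frac{1}{t+4}} < n^{\frac{1}{t+2}},$$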

as long as $\beta(k) < n$. This condition holds for any constant $k$ and fixed function $\beta(\cdot)$, and also for any $k = O(\log n)$ and quasi-polynomial function $\beta(\cdot)$. (Recall that in all relevant upper bounds for spanners/distance oracles/distance labeling schemes, it is always the case that $k = O(\log n)$ and $\beta(\cdot)$ is at most a quasi-polynomial function of $k$.) Hence this is a contradiction, and there can be no distance labeling scheme for unweighted graphs with label size $O(n^{\frac{1}{t+4}})$ and stretch $(t, t \cdot \beta(k))$, for any parameter $k$.
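The contradiction rests on comparing the exponents of $n$ in the two bounds on the label size. As a small sanity check, the sketch below verifies with exact rational arithmetic that $(1+\frac{1}{t+2}) \cdot \frac{1}{t+4} = \frac{t+3}{(t+2)(t+4)}$ and that this exponent falls short of $\frac{1}{t+2}$ by exactly $\frac{1}{(t+2)(t+4)}$. The function name is illustrative, and the factor depending on $t$ and $\beta(k)$ is deliberately left out of the check.

```python
from fractions import Fraction


def label_size_exponents(t: int):
    """Exponents of n in the upper bound N^{1/(t+4)} and the lower bound n^{1/(t+2)}."""
    upper = (1 + Fraction(1, t + 2)) * Fraction(1, t + 4)
    lower = Fraction(1, t + 2)
    return upper, lower, lower - upper
```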

The same argument clearly applies to routing schemes as well. The only difference is that one 
needs to use lower bounds on the tradeoff between space and multiplicative stretch for routing 
due to [28], [31], [2], instead of analogous lower bounds of [33] for distance labeling.

To summarize, while Theorem [6]8] provides a distance oracle with stretch {t, t-(3{k)) and average 
space per vertex of 0{{3{k) ■ for k 3> distance labeling or routing one needs at 

least space per vertex to achieve the same stretch guarantee. 

Similarly, one cannot have a distance labeling scheme for sparse graphs (graphs G = (V, E) 
with edges, for some k > 1) with maximum label size and stretch 0(f), for a 

parameter f -C fc. [j A distance labeling scheme as above requires maximum label size of 
as otherwise one would get a distance labeling with stretch {t,t ■ poly(fc)) for general graphs with 
maximum label size contradiction. 

^Recall that by Corollary 16.51 a path-reporting distance oracle of total size 0(n^“''^/^) with stretch 0(t) and 
query time |n(M,'(;)|) (for a query u,v; the constant c is given by c = logy4/3) does exist. 
7.2 Distance Oracles


Next we argue that in the cell-probe model of computation (cf., [2l]), any distance oracle with size and stretch like in Theorem 6.8 (i.e., size $O(\beta(k) \cdot n^{1+1/k})$ and stretch $(O(1),\beta(k))$, for a fixed function $\beta(\cdot)$) must have query time $\Omega(k)$. We rely on the following lower bound of [32].

Theorem 7.1 A distance oracle with stretch $t$ using query time $q$ requires space $S \ge n^{1+\frac{c}{t \cdot q}} / \log n$ in the cell-probe model with $w$-bit cells, even on unweighted undirected graphs with maximum degree at most $(t \cdot q \cdot w)^{O(1)}$, where $t = o\left(\frac{\log n}{\log w + \log\log n}\right)$, and $c$ is a positive constant.


Suppose for a contradiction that there exists a distance oracle with stretch $(t, t \cdot \beta(k))$, for a pair of parameters $t$ and $k$ and a fixed function $\beta(\cdot)$, with space at most $n^{1+\frac{c}{2tq}} / \log n$ (and query time $q$) for general unweighted graphs.

Let $G = (V,E)$ be an $n$-vertex unweighted graph with maximum degree at most $(t \cdot q \cdot w)^{O(1)}$, and let $G'$ be the graph obtained from $G$ by replacing each edge of $G$ by a path of length $10t \cdot \beta(k)$. The graph $G'$ has $N \le (t \cdot q \cdot w)^{O(1)} \cdot \beta(k) \cdot n$ vertices, and an oracle with stretch $(t, t \cdot \beta(k))$ for $G'$ can be used also as a stretch-$t$ oracle for $G$. The size of this oracle is, by our assumption, at most

$$\frac{N^{1+\frac{c}{2tq}}}{\log N} \le \frac{\left(n \cdot (t \cdot q \cdot w)^{O(1)} \cdot \beta(k)\right)^{1+\frac{c}{2tq}}}{\log n} = \frac{n^{1+\frac{c}{2tq}}}{\log n} \cdot \left((t \cdot q \cdot w)^{O(1)} \cdot \beta(k)\right)^{1+\frac{c}{2tq}}.$$

As long as $\left((t \cdot q \cdot w)^{O(1)} \cdot \beta(k)\right)^{1+\frac{c}{2tq}} < n^{\frac{c}{2tq}}$, i.e., as long as

$$\left((t \cdot q \cdot w)^{O(1)} \cdot \beta(k)\right)^{\frac{2}{c} t q + 1} < n, \qquad (2)$$

we have a contradiction to Theorem 7.1. (As the oracle uses less than $n^{1+\frac{c}{tq}} / \log n$ space and has stretch $t$ and query time $q$.)

For $k$ being at most a mildly growing function of $n$ (specifically, $k \le \log^{\zeta} n$, $\zeta < 1/2$), $t = o(k)$, $q < k$, $w = O(\log n)$, and $\beta(\cdot)$ being a polynomial (or even a quasi-polynomial) function, the condition (2) holds. Hence in this range of parameters, any distance oracle for unweighted graphs with stretch $(t, t \cdot \beta(k))$ and query time $q$ requires space $S \ge n^{1+\frac{c}{2tq}} / \log n$ in the cell-probe model with $w$-bit cells, assuming $t = o\left(\frac{\log n}{\log w + \log\log n}\right)$.

So if this oracle uses $S = O(n^{1+1/k} \cdot \beta(k))$ space, then it holds that $n^{1+1/k} \cdot \log n \cdot \beta(k) \ge n^{1+\frac{c}{2tq}}$, i.e.,

$$1 + \frac{1}{k} + \frac{\log\log n + \log \beta(k)}{\log n} \ge 1 + \frac{c/2}{t \cdot q},$$

and so $q = \Omega(k/t)$.
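The last step can be made concrete by solving the inequality $\frac{1}{k} + \frac{\log\log n + \log\beta(k)}{\log n} \ge \frac{c}{2tq}$ for $q$. The sketch below does this numerically. The constant $c$ of Theorem 7.1 is not specified in the text, so the default `c=1.0` is purely a placeholder assumption, as are the function and parameter names.

```python
import math


def query_time_lower_bound(n, k, t, beta_k, c=1.0):
    """Smallest q consistent with 1/k + (loglog n + log beta)/log n >= c/(2*t*q)."""
    log_n = math.log2(n)
    # Left-hand side of the inequality; it is at least 1/k.
    slack = 1 / k + (math.log2(log_n) + math.log2(beta_k)) / log_n
    return c / (2 * t * slack)
```

Since the left-hand side is at least $1/k$, the returned value never exceeds $c \cdot k / (2t)$, and it approaches that value when $(\log\log n + \log\beta(k)) / \log n$ is small compared with $1/k$, which is the regime $k \le \log^{\zeta} n$ considered above.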

We summarize this lower bound in the next theorem. 


Theorem 7.2 Let $k \le \log^{\zeta} n$, for any constant $\zeta < 1/2$, $t = o(k)$, $w = O(\log n)$, and $\beta(\cdot)$ being a polynomial or a quasi-polynomial function. In the cell-probe model with $w$-bit cells any distance oracle for general unweighted undirected $n$-vertex graphs with space $O(\beta(k) \cdot n^{1+1/k})$ and stretch $(t, t \cdot \beta(k))$ has query time $q = \Omega(k/t)$; in particular, $q = \Omega(k)$ for $t = O(1)$.

Theorem 7.2 states that in contrast to distance oracles with multiplicative stretch which can have constant query time (see mm), a distance oracle with stretch $(O(1), \beta(k))$ (like the one given by our Theorem 6.8) must have query time $\Omega(k)$.

Appendix


A Missing proofs 

In this section we provide proofs of Lemmas 6.1 and 6.2.

Proof of Lemma 6.1: Suppose for contradiction that there exists a pair $(x,y) \in \mathcal{P}_{i+1}$ such that the pairs $(u,v)$, $(x,y)$ participate in a branching event $\beta$, and such that either $x \notin \mathrm{Ball}_{i+1}(u)$ or $y \notin \mathrm{Ball}_{i+1}(u)$. Then $\beta = (\Pi(u,v), \Pi(x,y), z)$, where $\Pi(u,v)$ (respectively, $\Pi(x,y)$) is a shortest path between $u$ and $v$ (respectively, between $x$ and $y$), and $z$ is a node at which these two paths
branch. Since $(x,y) \in \mathcal{P}_{i+1}$ it follows that either $y \in \mathrm{Ball}^{1/3}_{i+1}(x)$ or $x \in \mathrm{Ball}^{1/3}_{i+1}(y)$. Without loss of generality suppose that $y \in \mathrm{Ball}^{1/3}_{i+1}(x)$.

The proof splits into two cases. In the first case we assume that $x \notin \mathrm{Ball}_{i+1}(u)$, and in the second we assume that $y \notin \mathrm{Ball}_{i+1}(u)$. (Note that the roles of $x$ and $y$ are not symmetric.) In both cases we reach a contradiction.

We start with the case $x \notin \mathrm{Ball}_{i+1}(u)$. Observe that $d_G(x,z) \le d_G(x,y) < \frac{1}{3} \cdot r_{i+1}(x)$ and $d_G(u,z) \le d_G(u,v) < \frac{1}{3} \cdot r_{i+1}(u)$. Denote $\delta = d_G(u,u^{(i+1)}) = r_{i+1}(u)$, where $u^{(i+1)} = \ell_{i+1}(u)$. Denote also $\delta' = d_G(u,x)$. Observe that $r_{i+1}(x) \le d_G(x,u^{(i+1)}) \le \delta + \delta'$, and also (since $x \notin \mathrm{Ball}_{i+1}(u)$) $\delta' = d_G(u,x) \ge \delta = r_{i+1}(u)$. Then

$$d_G(u,z) + d_G(z,x) < \frac{1}{3} \cdot r_{i+1}(u) + \frac{1}{3} \cdot r_{i+1}(x) \le \frac{1}{3}\delta + \frac{1}{3}(\delta + \delta') \le \delta' = d_G(u,x).$$

Hence $d_G(u,z) + d_G(z,x) < d_G(u,x)$, contradicting the triangle inequality.

We are now left with the case that $x \in \mathrm{Ball}_{i+1}(u)$, but $y \notin \mathrm{Ball}_{i+1}(u)$. Then $d_G(y,z) \le d_G(x,y) < \frac{1}{3} \cdot r_{i+1}(x)$. Also, $d_G(u,z) \le d_G(u,v) < \frac{1}{3} \cdot r_{i+1}(u)$. In addition, $r_{i+1}(x) \le d_G(x,u^{(i+1)}) \le d_G(x,u) + r_{i+1}(u) < 2\delta$. (Note that $d_G(x,u) < \delta = r_{i+1}(u)$, because $x \in \mathrm{Ball}_{i+1}(u)$.) Hence

$$d_G(u,z) + d_G(z,y) < \frac{1}{3} \cdot (r_{i+1}(u) + r_{i+1}(x)) \le \frac{1}{3} \cdot (\delta + 2\delta) = \delta \le d_G(u,y).$$

(The last inequality is because, by assumption, $y \notin \mathrm{Ball}_{i+1}(u)$.) This is, however, again a contradiction to the triangle inequality. ∎

Proof of Lemma 6.2: Recall that (see [H], Lemma 7.5) each pair $(u,v)$, $(x,y)$ may produce at most two branching events. Hence next we focus on providing an upper bound on the number of intersecting pairs of paths $\Pi(u,v)$, $\Pi(x,y)$ for $(u,v), (x,y) \in \mathcal{P}_i$.

By the previous lemma, for a pair $(u,v)$, $(x,y)$ to create a branching event there must be one of these four vertices (without loss of generality we call it $u$) such that the three other vertices belong to $\mathrm{Ball}_{i+1}(u)$. Hence the number of intersecting pairs as above is at most (a constant factor multiplied by) the number of quadruples $(u,v,x,y)$ with $v,x,y \in \mathrm{Ball}_{i+1}(u)$. For a fixed $i$-landmark $u$, the number of vertices in its $(i+1)$st ball $\mathrm{Ball}_{i+1}(u)$ is, whp, $O\left(\frac{n}{\rho_{i+1}} \cdot \log n\right)$. (This random variable is distributed geometrically with the parameter $p = \frac{\rho_{i+1}}{n}$.) Each of the vertices in $\mathrm{Ball}_{i+1}(u)$ has probability $\frac{\rho_i}{n}$ to belong to $L_i$, independently of other vertices. Hence, by Chernoff's bound, whp, there are $\frac{\rho_i}{n} \cdot O\left(\frac{n}{\rho_{i+1}} \cdot \log n\right) = O\left(\frac{\rho_i}{\rho_{i+1}} \cdot \log n\right)$ $i$-landmarks in $\mathrm{Ball}_{i+1}(u)$. (We select the constant $c$ hidden by the $O$-notation in $O\left(\frac{\rho_i}{\rho_{i+1}} \cdot \log n\right)$ to be sufficiently large. Then the expectation is $c \cdot \frac{\rho_i}{\rho_{i+1}} \cdot \log n \ge c \cdot \log n$. Hence Chernoff's bound applies with high probability.)

Hence the number of triples $v, x, y$ of $i$-landmarks in $\mathrm{Ball}_{i+1}(u)$ is, whp, $O\left(\frac{\rho_i^3}{\rho_{i+1}^3} \cdot \log^3 n\right)$. The number of $i$-landmarks $u$ is, by Chernoff's bound, whp, $O(\rho_i)$. Hence the number of quadruples as above is, whp, at most

$$O(\rho_i) \cdot O\left(\frac{\rho_i^3}{\rho_{i+1}^3} \cdot \log^3 n\right) = O\left(\frac{\rho_i^4}{\rho_{i+1}^3} \cdot \log^3 n\right).$$

Also, the number of pairs $|\mathcal{P}_i|$ is at most the number of $i$-landmarks (whp, it is $O(\rho_i)$) multiplied by the maximum number of $i$-landmarks in an $(i+1)$-level ball $\mathrm{Ball}_{i+1}(u)$ (whp, it is $O\left(\frac{\rho_i}{\rho_{i+1}} \cdot \log n\right)$), i.e., $|\mathcal{P}_i| = O\left(\frac{\rho_i^2}{\rho_{i+1}} \cdot \log n\right)$.

Next we argue that the expected number of quadruples $(u,v,x,y)$ of $i$-landmarks such that $v,x,y \in \mathrm{Ball}_{i+1}(u)$ is $O\left(\frac{\rho_i^4}{\rho_{i+1}^3}\right)$ and that $\mathbb{E}(|\mathcal{P}_i|) = O\left(\frac{\rho_i^2}{\rho_{i+1}}\right)$.

For a fixed vertex $u$, write $X(u) = I(\{u \in L_i\}) \cdot Y(u)$, where $Y(u)$ is the number of triples of distinct $i$-landmarks different from $u$ which belong to $\mathrm{Ball}_{i+1}(u)$, and $I(\{u \in L_i\})$ is the indicator random variable of the event $\{u \in L_i\}$. (Note that the ball is defined even if $u \notin L_i$.) Observe that the random variables $I(\{u \in L_i\})$ and $Y(u)$ are independent, and thus

$$\mathbb{E}(X(u)) = \mathbb{E}(I(\{u \in L_i\})) \cdot \mathbb{E}(Y(u)) = \frac{\rho_i}{n} \cdot \mathbb{E}(Y(u)).$$

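Let $\sigma = (v_1, v_2, \ldots, v_{n-1})$ be the sequence of vertices ordered by the non-decreasing distance from $u$. (They appear in the order in which the Dijkstra algorithm initiated at $u$ discovers them.) For $k = 3,4,\ldots,n-1$, denote by $J_k$ the random variable which is equal to 0 if $v_{k+1}$ is not the first vertex in $\sigma$ which belongs to $L_{i+1}$. If $v_{k+1}$ is the first vertex as above then $J_k$ is equal to the number of triples $(v_{j_1}, v_{j_2}, v_{j_3})$, $1 \le j_1 < j_2 < j_3 \le k$, such that $v_{j_1}, v_{j_2}, v_{j_3} \in L_i$. Also, for each quadruple $1 \le j_1 < j_2 < j_3 < j_4 \le n-1$ of indices, define $J(j_1,j_2,j_3,j_4)$ to be the indicator random variable of the event that $v_{j_1}, v_{j_2}, v_{j_3} \in L_i$, $v_{j_4} \in L_{i+1}$, and for each $j$, $1 \le j < j_4$, the vertex $v_j$ is not an $(i+1)$-landmark. Observe that

$$\mathbb{E}(J(j_1,j_2,j_3,j_4)) \le \left(\frac{\rho_i}{n}\right)^3 \left(1 - \frac{\rho_{i+1}}{n}\right)^{j_4 - 1} \frac{\rho_{i+1}}{n}.$$

Also,

$$\mathbb{E}(J_k) = \sum_{1 \le j_1 < j_2 < j_3 \le k} \mathbb{E}(J(j_1,j_2,j_3,k+1)) \le \binom{k}{3} \left(\frac{\rho_i}{n}\right)^3 \left(1 - \frac{\rho_{i+1}}{n}\right)^{k} \frac{\rho_{i+1}}{n}.$$

Note that $Y(u) = \sum_{k \ge 3} J_k$, and so $\mathbb{E}(Y(u)) = \sum_{k \ge 3} \mathbb{E}(J_k)$.

Denote $A = 10 \cdot \frac{n}{\rho_{i+1}}$. For $k \le A$, since $\left(1 - \frac{\rho_{i+1}}{n}\right)^k = O(1)$, it follows that $\mathbb{E}(J_k) = O\left(k^3 \cdot \frac{\rho_i^3 \cdot \rho_{i+1}}{n^4}\right)$. Also,

$$\sum_{k=3}^{A} \binom{k}{3} \left(\frac{\rho_i}{n}\right)^3 \left(1 - \frac{\rho_{i+1}}{n}\right)^{k} \frac{\rho_{i+1}}{n} = O\left(\frac{\rho_i^3 \cdot \rho_{i+1}}{n^4}\right) \cdot \sum_{k=3}^{A} k^3 = O\left(\frac{\rho_i^3 \cdot \rho_{i+1}}{n^4} \cdot A^4\right) = O\left(\frac{\rho_i^3}{\rho_{i+1}^3}\right).$$

Denote $\gamma = 1 - \rho_{i+1}/n$. Then

$$\sum_{k=A+1}^{\infty} \binom{k}{3} \gamma^k \le \frac{\gamma^3}{6} \cdot \frac{d^3}{d\gamma^3} \sum_{k \ge 0} \gamma^k = \frac{\gamma^3}{6} \cdot \frac{d^3}{d\gamma^3} \frac{1}{1-\gamma} = \frac{\gamma^3}{(1-\gamma)^4} = O\left(\frac{n^4}{\rho_{i+1}^4}\right).$$

Hence

$$\sum_{k=A+1}^{\infty} \binom{k}{3} \left(\frac{\rho_i}{n}\right)^3 \left(1 - \frac{\rho_{i+1}}{n}\right)^{k} \frac{\rho_{i+1}}{n} = O\left(\frac{\rho_i^3 \cdot \rho_{i+1}}{n^4}\right) \cdot O\left(\frac{n^4}{\rho_{i+1}^4}\right) = O\left(\frac{\rho_i^3}{\rho_{i+1}^3}\right),$$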

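and so $\mathbb{E}(Y(u)) = O\left(\frac{\rho_i^3}{\rho_{i+1}^3}\right)$. Hence $\mathbb{E}(X(u)) = \frac{\rho_i}{n} \cdot \mathbb{E}(Y(u)) = O\left(\frac{\rho_i^4}{\rho_{i+1}^3} \cdot \frac{1}{n}\right)$.

Finally, the overall expected number of quadruples $(u,v,x,y)$ of $i$-landmarks such that $v,x,y \in \mathrm{Ball}_{i+1}(u)$ is, by linearity of expectation, at most $\sum_{u \in V} \mathbb{E}(X(u)) = O\left(\frac{\rho_i^4}{\rho_{i+1}^3}\right)$.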
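The tail of the sum in this calculation is controlled by the standard identity $\sum_{k \ge 3} \binom{k}{3} \gamma^k = \frac{\gamma^3}{(1-\gamma)^4}$, which follows from differentiating the geometric series three times. The sketch below compares a long partial sum with the closed form for a given $\gamma = 1 - \rho_{i+1}/n$. The function names and the truncation point `terms` are illustrative assumptions, not part of the proof.

```python
from math import comb


def binomial_tail_sum(gamma: float, terms: int = 5000) -> float:
    """Partial sum of sum_{k >= 3} C(k, 3) * gamma^k, truncated after `terms` values of k."""
    return sum(comb(k, 3) * gamma ** k for k in range(3, terms))


def closed_form(gamma: float) -> float:
    """gamma^3 / (1 - gamma)^4, obtained from the third derivative of 1 / (1 - gamma)."""
    return gamma ** 3 / (1 - gamma) ** 4
```

With $\gamma = 1 - \rho_{i+1}/n$ the closed form is at most $(n/\rho_{i+1})^4$, which is the $O(n^4/\rho_{i+1}^4)$ factor that cancels against $\rho_i^3 \rho_{i+1} / n^4$ to give $O(\rho_i^3/\rho_{i+1}^3)$.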


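A similar argument provides an upper bound of $O\left(\frac{\rho_i^2}{\rho_{i+1}}\right)$ on the expected number of pairs $|\mathcal{P}_i|$. We shortly sketch it below.

For a vertex $u$, let $X'(u) = I(\{u \in L_i\}) \cdot Y'(u)$, where $Y'(u)$ is the number of $i$-landmarks which belong to $\mathrm{Ball}_{i+1}(u)$. Clearly, $\mathbb{E}(I(\{u \in L_i\})) = \rho_i/n$, and the two random variables ($I(\{u \in L_i\})$ and $Y'(u)$) are independent. For every integer $k \ge 1$, let $J'_k$ be a random variable which is equal to 0 if $v_{k+1}$ is not the first vertex in $\sigma$ which belongs to $L_{i+1}$. Otherwise it is the number of $i$-landmarks among $v_1, v_2, \ldots, v_k$. For integers $j_1, j_2$, $1 \le j_1 < j_2 \le n-1$, let $J'(j_1,j_2)$ be the indicator random variable of the event that $v_{j_1} \in L_i$, $v_{j_2} \in L_{i+1}$, and for every $j < j_2$, it holds that $v_j \notin L_{i+1}$. Then

$$\mathbb{E}(J'(j_1,j_2)) \le \frac{\rho_i}{n} \left(1 - \frac{\rho_{i+1}}{n}\right)^{j_2 - 1} \frac{\rho_{i+1}}{n}.$$

Hence

$$\mathbb{E}(J'_k) = \sum_{1 \le j_1 \le k} \mathbb{E}(J'(j_1,k+1)) \le \frac{\rho_i \cdot \rho_{i+1}}{n^2} \cdot k \cdot \left(1 - \frac{\rho_{i+1}}{n}\right)^{k},$$

and

$$\mathbb{E}(Y'(u)) \le \sum_{k \ge 1} \mathbb{E}(J'_k).$$

Write $A = 10 \cdot \frac{n}{\rho_{i+1}}$, $\gamma = 1 - \rho_{i+1}/n$, and

$$\sum_{k \ge 1} \mathbb{E}(J'_k) \le \frac{\rho_i \cdot \rho_{i+1}}{n^2} \left( \sum_{k=1}^{A} k \cdot \gamma^k + \sum_{k > A} k \cdot \gamma^k \right).$$

Each term of the first sum is at most $A$, and thus the first sum is at most $O(A^2) = O\left(\frac{n^2}{\rho_{i+1}^2}\right)$. The second sum is at most $\sum_{k \ge 1} k \cdot \gamma^k = \frac{\gamma}{(1-\gamma)^2} = O\left(\frac{n^2}{\rho_{i+1}^2}\right)$ as well. Hence

$$\mathbb{E}(Y'(u)) \le \frac{\rho_i \cdot \rho_{i+1}}{n^2} \cdot O\left(\frac{n^2}{\rho_{i+1}^2}\right) = O\left(\frac{\rho_i}{\rho_{i+1}}\right).$$

Hence $\mathbb{E}(X'(u)) = O(\rho_i^2/(\rho_{i+1} \cdot n))$, and by linearity of expectation we conclude that $\mathbb{E}(|\mathcal{P}_i|) \le \sum_{u \in V} \mathbb{E}(X'(u)) = O(\rho_i^2/\rho_{i+1})$. ∎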
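As a numeric illustration of this sketch, the code below evaluates the upper bound $\sum_{k} k \cdot \frac{\rho_i}{n} \cdot \gamma^k \cdot \frac{\rho_{i+1}}{n}$ on $\mathbb{E}(Y'(u))$ directly and multiplies it by $n \cdot \rho_i / n$ to bound $\mathbb{E}(|\mathcal{P}_i|)$. It assumes, for illustration only, that each vertex is an $i$-landmark with probability $\rho_i/n$ and an $(i+1)$-landmark with probability $\rho_{i+1}/n$; the function name and the sample parameters are hypothetical.

```python
def expected_pairs_bound(n: int, rho_i: float, rho_next: float):
    """Return (upper bound on E|P_i| from the sum, the target rho_i^2 / rho_{i+1})."""
    p = rho_i / n          # probability of being an i-landmark
    q = rho_next / n       # probability of being an (i+1)-landmark
    gamma = 1 - q
    # Upper bound on E(Y'(u)): v_{k+1} is the first (i+1)-landmark, k candidates before it.
    e_y = sum(k * p * gamma ** k * q for k in range(1, n - 1))
    # E(X'(u)) = p * E(Y'(u)); summing over all n vertices gives the bound on E|P_i|.
    total = n * p * e_y
    return total, rho_i ** 2 / rho_next
```

For example, with `n=2000`, `rho_i=200` and `rho_next=20` the first returned value stays just below the second, in line with the $O(\rho_i^2/\rho_{i+1})$ bound.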





